I'm trying to scrape a website. The results are as expected when I run my code on my own local server, but when I deploy it to a GCP VM, some of the HTML tags are missing. I've made sure that the source code is the same both locally and on GCP.
Interestingly, if I change my headers, even more tags go missing. So far, I've found that these headers work best:
headers = {
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1 Edg/87.0.4280.141",
    "Content-Type": "application/x-www-form-urlencoded",
    "Connection": "keep-alive",
}
Is the missing-tags problem caused by the headers being sent, or by something else happening on the GCP VM?
To recap troubleshooting done in comments:
You can find more information about scraping from GCP here.
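To help tell a header effect apart from an effect of the VM's network origin, one option is to save the HTML fetched in each environment and compare which tags are present. The following is only a minimal sketch using the standard library; the names `TagCounter`, `count_tags` and `missing_tags` are hypothetical helpers, and the two sample strings are made-up stand-ins for the pages you would actually save locally and on the VM.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every opening tag seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag (self-closing tags are routed here too).
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name -> number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def missing_tags(local_html, remote_html):
    """Return {tag: how many more times it appears locally than remotely}."""
    # Counter subtraction keeps only tags with a positive difference.
    return dict(count_tags(local_html) - count_tags(remote_html))


# Hypothetical sample data; replace with the HTML saved in each environment.
local_sample = "<html><body><div><p>a</p><p>b</p></div><span>c</span></body></html>"
remote_sample = "<html><body><div><p>a</p></div></body></html>"

if __name__ == "__main__":
    print(missing_tags(local_sample, remote_sample))
```

Running the same comparison with identical headers in both environments, and then again with different headers in one environment, should indicate which of the two factors actually changes the markup the site returns.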