I'm a newbie with Python. In PyCharm I wrote this code:
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.google.com/search?q=fitness+wear")
soup = BeautifulSoup(response.text, 'html.parser')
print(soup)
Instead of getting the HTML of the search results, what I get is the HTML of the following page
I use the same code within a script on pythonanywhere.com and it works perfectly. I've tried lots of the solutions I found but the result is always the same, so now I'm stuck with it.
I think this should work:
import requests
from bs4 import BeautifulSoup
with requests.Session() as s:
    url = "https://www.google.com/search?q=fitness+wear"
    headers = {
        # The header value is just the URL, without a "referer: " prefix
        "referer": "https://www.google.com/",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
    }
    # Initial POST so the session can pick up any cookies the site sets
    s.post(url, headers=headers)
    # The GET reuses the same session, including those cookies
    response = s.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    print(soup)
It uses a requests session and a POST request to create any initial cookies (I'm not fully sure about this), and then lets you scrape the page.
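If you want to check exactly what the session is going to send before hitting Google, here is a minimal sketch that only prepares the request and does not send it. The helper name `build_search_request` and the constants `SEARCH_URL` and `HEADERS` are my own, not part of requests; the headers are the same ones used above, and passing the query through `params` is just one way to let requests do the URL encoding.

```python
import requests

# Hypothetical names for illustration; the values mirror the answer above.
SEARCH_URL = "https://www.google.com/search"
HEADERS = {
    "referer": "https://www.google.com/",
    "user-agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
    ),
}


def build_search_request(session, query):
    """Prepare (but do not send) a GET request for the given search query."""
    request = requests.Request("GET", SEARCH_URL, params={"q": query})
    # prepare_request merges the session's headers and cookies into the request
    return session.prepare_request(request)


session = requests.Session()
# Headers set on the session are sent with every request made through it
session.headers.update(HEADERS)

prepared = build_search_request(session, "fitness wear")
print(prepared.url)
print(prepared.headers["referer"])
```

To actually fetch the page you would call `session.send(prepared, timeout=10)` and pass the `.text` of the result to BeautifulSoup as before. Whether Google then returns the search results still depends on what it decides to serve to your IP and headers, so treat this as a debugging aid rather than a guaranteed fix.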