I've just started writing a price tracker in Python and have already run into an error I don't understand. This is the code:
import requests
from bs4 import BeautifulSoup
URL = 'https://www.amazon.com/Corsair-Platinum-Mechanical-Keyboard-Backlit/dp/B082GR814B/'
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0."
                         "4103.116 Safari/537.36"}
targetPrice = 150
def getPrice():
    page = requests.get(URL, headers=HEADERS)
    soup = BeautifulSoup(page.content, 'html.parser')
    price = soup.find(id="priceblock_ourprice").get_text()  # Error happens here
    print(price)
if True:
    getPrice()
I can see that soup.find(id="priceblock_ourprice") returns None, hence the AttributeError, but I don't understand why it returns None. The code worked exactly once and printed the product price. When I ran the script again without changing anything, I got the AttributeError again, and I have gotten it consistently ever since.
I've also tried the following:
Using html5lib and lxml instead of html.parser. Looking up different ids, to see whether I could access other parts of the page. Using other User-Agent strings. I also downloaded a similar program from GitHub that uses the exact same code to see if it would run, but it didn't either.
What is happening here? Any help would be appreciated.
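You're getting a captcha page instead of the product page. That page has no element with the id priceblock_ourprice, so soup.find() returns None. Try sending more of the HTTP headers a browser would send, so that you get the correct page. When I set the Accept-Language HTTP header, I can no longer reproduce the error: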
import requests
from bs4 import BeautifulSoup
URL = 'https://www.amazon.com/Corsair-Platinum-Mechanical-Keyboard-Backlit/dp/B082GR814B/'
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0",
    'Accept-Language': 'en-US,en;q=0.5',
}
def getPrice():
    page = requests.get(URL, headers=HEADERS)
    soup = BeautifulSoup(page.content, 'html.parser')
    price = soup.find(id="priceblock_ourprice").get_text()
    print(price)
getPrice()
Prints:
$195.99
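Since the site may still serve a captcha page from time to time, it can help to guard the lookup so the script reports what happened instead of raising AttributeError. The following is only a minimal sketch: the helper names extract_price and looks_like_captcha and the CAPTCHA_MARKERS phrases are my own assumptions for illustration, not anything the site documents.

```python
from typing import Optional

from bs4 import BeautifulSoup

# Assumption: phrases that tend to appear on a bot-check page.
# These are illustrative guesses, not a documented contract of the site.
CAPTCHA_MARKERS = ("captcha", "robot check")


def looks_like_captcha(html: str) -> bool:
    """Return True if the HTML contains one of the assumed captcha markers."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def extract_price(html: str, element_id: str = "priceblock_ourprice") -> Optional[str]:
    """Return the stripped text of the price element, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id=element_id)
    if tag is None:
        # soup.find() returns None when no element has this id,
        # which is what happens when a captcha page comes back.
        return None
    return tag.get_text(strip=True)
```

Inside getPrice() you could pass page.text to extract_price() and, when it returns None, call looks_like_captcha(page.text) to decide whether to print a "got a captcha page" message or to check that the id is still correct.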