I'm creating a price comparison website for my uni project. I'm trying to print the items and the price from this website https://www.lotuss.com.my/en/category/fresh-produce?sort=relevance:DESC but I got this error:
Exception has occurred: TypeError
'NoneType' object is not callable
File "C:\xampp\htdocs\Price\test.py", line 36, in <module>
grocery_items = soup.findall('div', class_='product-grid-item')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable
This is the code:
from bs4 import BeautifulSoup
import requests
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
chrome_options = Options()
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
service = Service(executable_path='C:/chromedriver/chromedriver.exe')
driver = webdriver.Chrome(service=service, options=chrome_options)
# Open the webpage
driver.get('https://www.lotuss.com.my/en/category/fresh-produce?sort=relevance:DESC')
# Wait for the page to fully load
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "iframe"))
    )
    print("Please solve the CAPTCHA manually in the opened browser window.")
finally:
    input("Press Enter after solving the CAPTCHA...")
html_text = driver.page_source
driver.quit()
soup = BeautifulSoup(html_text, 'lxml')
grocery_items = soup.findall('div', class_='product-grid-item')
grocery_price = soup.findall('span', class_='sc-kHxTfl hwpbzy')
print(grocery_items)
print(grocery_price)
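The error is caused by calling a method that does not exist on the soup object: the method is find_all, not findall. BeautifulSoup treats an unknown attribute name as a tag lookup, so soup.findall is evaluated like soup.find('findall'), which returns None. Calling that None is what raises "'NoneType' object is not callable".
The corrected lines are: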
grocery_items = soup.find_all('div', class_='product-grid-item')
grocery_price = soup.find_all('span', class_='sc-kHxTfl hwpbzy')
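To see the difference without opening a browser, here is a minimal sketch. The HTML string is a made-up stand-in for the real page source, and it uses the built-in html.parser instead of lxml only so that it runs without extra dependencies:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for driver.page_source.
sample_html = """
<div class="product-grid-item">Banana</div>
<div class="product-grid-item">Apple</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# An unknown attribute name is treated as a tag lookup, so this is
# equivalent to soup.find("findall") and evaluates to None.
missing = soup.findall
print(missing)

# find_all is the real method and returns the matching tags.
items = soup.find_all("div", class_="product-grid-item")
names = [item.get_text(strip=True) for item in items]
print(names)
```

Calling soup.findall(...) on this object would fail with the same TypeError as in the question, because it tries to call None.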
I tested your code and there are a few more issues. Once you have fixed the error asked about in this question, you may notice that nothing is printed to the console. To address that:
By default, Selenium does not open the browser window maximized, so some elements may not be visible and not all of the targeted elements may be located. Maximize the Chrome window right after loading the page:
driver.get('https://www.lotuss.com.my/en/category/fresh-produce?sort=relevance:DESC')
driver.maximize_window()
The last two lines of your code print the matched HTML elements rather than their text:
print(grocery_items)
print(grocery_price)
Instead, print the text content of each element:
for item in grocery_items:
    print(item.get_text(strip=True))
for price in grocery_price:
    print(price.get_text(strip=True))
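Since this is for a price comparison site, you will probably want each item name next to its price rather than two separate lists. One way to do that is to search inside each product card. This is only a sketch: the markup and the name and price class names below are hypothetical, so inspect the real page and substitute its actual classes:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure and class names may differ.
sample_html = """
<div class="product-grid-item">
  <span class="name">Banana</span><span class="price">RM 4.50</span>
</div>
<div class="product-grid-item">
  <span class="name">Apple</span><span class="price">RM 6.90</span>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

rows = []
for card in soup.find_all("div", class_="product-grid-item"):
    # Look up the name and price within this card only.
    name = card.find("span", class_="name")
    price = card.find("span", class_="price")
    if name is None or price is None:
        continue  # skip cards that are missing either field
    rows.append((name.get_text(strip=True), price.get_text(strip=True)))

for name, price in rows:
    print(f"{name}: {price}")
```

Searching per card keeps a name and its price together even if some card has no price, which two independent find_all calls cannot guarantee.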