python web-scraping beautifulsoup python-requests

How do I fix my code? It is returning an empty list.


I am scraping an e-commerce website, and it's returning an empty list.

This is the code I wrote.

import requests
from bs4 import BeautifulSoup

baseurl = 'https://www.thewhiskyexchange.com/'

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36 Edg/126.0.0.0'}

r = requests.get('https://www.thewhiskyexchange.com/c/35/japanese-whisky')
soup = BeautifulSoup(r.content, 'lxml')

productlist = soup.find_all("li", class_="product-grid__item")
print(productlist)

This is the result I am getting:

[]

Solution

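  • As mentioned in the comment, the site is blocking your request, so the HTML you receive does not contain the product list and find_all returns []. You can get around this by loading the page in a real browser with Playwright.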
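To check whether a request is being blocked, rather than the selector being wrong, it can help to look at the status code and at whether the class name appears in the raw response at all. The sketch below is only an illustration: `diagnose` and its messages are hypothetical, not part of requests or BeautifulSoup, and it simply inspects the values you would pass in from `r.status_code` and `r.text`.

```python
def diagnose(status_code: int, html: str, marker: str = "product-grid__item") -> str:
    """Give a rough reason why find_all() might have returned an empty list.

    Hypothetical helper for illustration; `marker` is the class name the
    question's code searches for.
    """
    if status_code != 200:
        # A non-200 status usually means the request was refused or failed.
        return f"request failed or was blocked (HTTP {status_code})"
    if marker not in html:
        # The server answered, but the product markup is absent: possibly a
        # challenge page, or content that is rendered later by JavaScript.
        return "HTTP 200, but the product markup is not in the response"
    return "product markup is present; check the selector instead"


# Typical use with the question's request (not executed here):
#   r = requests.get(url, headers=headers)
#   print(diagnose(r.status_code, r.text))
print(diagnose(403, "<html>Access denied</html>"))
print(diagnose(200, '<li class="product-grid__item"></li>'))
```

If the first or second message comes back, changing the BeautifulSoup selector will not help, because the markup never reached your script.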

    Install the playwright package and then install Chromium.

    pip3 install playwright
    playwright install chromium
    

    The code below will grab the names and URL paths for the products selected via the CSS you provided.

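    from playwright.sync_api import sync_playwright
    from bs4 import BeautifulSoup

    URL = "https://www.thewhiskyexchange.com/c/35/japanese-whisky"

    # The context manager stops Playwright cleanly, even if an error occurs.
    with sync_playwright() as playwright:
        # headless=False shows the browser window; slow_mo delays each action by 2000 ms.
        browser = playwright.chromium.launch(headless=False, slow_mo=2000)
        context = browser.new_context(
            viewport={"width": 1280, "height": 900}
        )
        page = context.new_page()
        page.goto(URL)

        # Wait until the product grid has rendered before reading the HTML.
        page.locator(".product-grid__list").wait_for()
        soup = BeautifulSoup(page.content(), "lxml")

        # Closing the browser also closes its contexts and pages.
        browser.close()

    # Each product card is a link directly inside an <li class="product-grid__item">.
    for product in soup.select("li.product-grid__item > a"):
        name = product.select_one("p.product-card__name").text
        url = product["href"]
        print(f"{name} | {url}")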
    

    If you don't want to see the browser window, set headless=True in the launch() call.

    Sample output:

    Hibiki Harmony | /p/29388/hibiki-harmony
    Nikka Coffey Grain Whisky | /p/23928/nikka-coffey-grain-whisky
    Hakushu Distiller's Reserve | /p/23771/hakushu-distillers-reserve
    Yamazaki 12 Year Old | /p/2940/yamazaki-12-year-old
    Suntory Toki | /p/36362/suntory-toki
    Yamazaki Distiller's Reserve | /p/23772/yamazaki-distillers-reserve
    Ichiro's Malt MWRMizunara Wood Reserve | /p/46186/ichiros-malt-mwr-mizunara-wood-reserve
    Chichibu Red Wine Cask 2023 | /p/80949/chichibu-red-wine-cask-2023
    Kanosuke Single Malt | /p/72178/kanosuke-single-malt
    Hibiki 21 Year Old | /p/10134/hibiki-21-year-old
    Fuji Single Malt Whisky | /p/72434/fuji-single-malt-whisky
    Yamazaki 18 Year OldGift Box | /p/81705/yamazaki-18-year-old-gift-box
    The Chita Distiller's Reserve | /p/44794/the-chita-distillers-reserve
    Yoichi Single Malt | /p/32761/yoichi-single-malt
    Kaiyo Mizunara Oak Cask Strength | /p/45367/kaiyo-mizunara-oak-cask-strength
    Mars Tsunuki2022 Edition | /p/71682/mars-tsunuki-2022-edition
    Kanosuke Single Malt 2022 Limited Edition | /p/71032/kanosuke-single-malt-2022-limited-edition
    Nikka Miyagikyo PeatedDiscovery Series 2021 | /p/61449/nikka-miyagikyo-peated-discovery-series-2021
    Miyagikyo Single Malt | /p/32762/miyagikyo-single-malt
    Yamazaki 12 Year Old100th Anniversary | /p/71847/yamazaki-12-year-old-100th-anniversary
    Hakushu 12 Year Old | /p/2922/hakushu-12-year-old
    Chichibu The Peated 2022 | /p/70564/chichibu-the-peated-2022
    Kanosuke Double Distillery Blended Whisky | /p/81245/kanosuke-double-distillery-blended-whisky
    Kanosuke Hioki Pot Still | /p/81244/kanosuke-hioki-pot-still
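
The `href` values in the sample output are paths relative to the site root. If absolute links are needed, one option is `urllib.parse.urljoin` with the `baseurl` value from the question. This is a minimal sketch: `to_absolute` is a hypothetical helper name, and the two rows are copied from the sample output rather than scraped live.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.thewhiskyexchange.com/"  # same value as baseurl in the question


def to_absolute(path: str, base: str = BASE_URL) -> str:
    """Resolve a scraped href against the site root (hypothetical helper)."""
    return urljoin(base, path)


# Two (name, path) pairs taken from the sample output.
rows = [
    ("Hibiki Harmony", "/p/29388/hibiki-harmony"),
    ("Suntory Toki", "/p/36362/suntory-toki"),
]
for name, path in rows:
    print(f"{name} | {to_absolute(path)}")
```

`urljoin` leaves an href that is already absolute unchanged, so the helper is safe to apply to every link on the page.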