Tags: python, web-scraping, web, beautifulsoup, python-requests

How can I extract text from a specific div with class "swatch-option text" using Python and BeautifulSoup?


I'm trying to scrape shoe sizes from a website using Python and BeautifulSoup. The shoe sizes are located in a div with the class "swatch-option text". I've already managed to extract other information, such as shoe names and prices, but I'm having trouble getting the shoe sizes.

Here is my current code:

import requests
from bs4 import BeautifulSoup

# Define the URL to scrape
url = "https://www.fuel.com.gr/el/catalogsearch/result/index/?gender=5431%2C5432&q=Dunk"

# Fetch the HTML content of the page
response = requests.get(url)
html = response.text

# Parse the HTML content
soup = BeautifulSoup(html, 'html.parser')

# Find all <li> elements with class "item product product-item"
shoe_items = soup.find_all('li', class_="item product product-item")

for shoe_item in shoe_items:
    shoe_name = shoe_item.find('strong', class_='product name product-item-name').text.strip()
    shoe_price = shoe_item.find('span', class_='price').text.strip()

    # Check if the 'data-mage-init' script exists before extracting data
    data_mage_init = shoe_item.find('script', {'type': 'text/x-magento-init'})
    if data_mage_init:
        # Fetch the individual product page URL
        product_link = data_mage_init['data-mage-init']
        print("Name:", shoe_name)
        print("Price:", shoe_price)
        print("Product Link:", product_link)

        # Fetch the individual product page
        product_response = requests.get(product_link)
        product_html = product_response.text
        product_soup = BeautifulSoup(product_html, 'html.parser')

        # Extract and print the available shoe sizes using the specified CSS selector
        shoe_sizes = [size.text.strip() for size in product_soup.select('#amasty-shopby-product-list .text')]
        print("Sizes:", ', '.join(shoe_sizes))
        print("\n")
    else:
        # Handle the case where 'data-mage-init' is not found
        print("Name:", shoe_name)
        print("Price:", shoe_price)
        print("Sizes: No sizes available")
        print("\n")

I would like to know how to modify this code to extract shoe sizes from the "swatch-option text" div. Any help or guidance on this issue would be greatly appreciated. Thanks!



Solution

  • To extract the sizes, use the CSS selector "div.swatch-attribute.size". I checked this selector against the product page's markup.
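As a minimal sketch of how that selector could be applied, the snippet below runs it against a small inline HTML fragment rather than the live site. The fragment, the SAMPLE_HTML name, and the extract_sizes helper are assumptions made for illustration; the real product page may be structured differently. Note that "swatch-option text" is two separate classes, so the CSS form chains them as ".swatch-option.text".

```python
from bs4 import BeautifulSoup

# Hypothetical product-page fragment (an assumption; the real markup may differ).
SAMPLE_HTML = """
<div class="swatch-attribute size">
  <span class="swatch-attribute-label">Size</span>
  <div class="swatch-attribute-options">
    <div class="swatch-option text">40</div>
    <div class="swatch-option text">41</div>
    <div class="swatch-option text">42</div>
  </div>
</div>
"""


def extract_sizes(html):
    """Return the text of every size swatch found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # Two classes on one element are chained with dots in a CSS selector.
    options = soup.select("div.swatch-attribute.size div.swatch-option.text")
    return [option.get_text(strip=True) for option in options]


sizes = extract_sizes(SAMPLE_HTML)
print("Sizes:", ", ".join(sizes) if sizes else "No sizes available")
```

If the same call returns an empty list for HTML fetched with requests, the swatches may be rendered by JavaScript after the page loads, in which case they would not be present in response.text and a different approach (for example, reading the embedded JSON configuration or using a browser driver) would be needed.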