
Python: "if i.find('a')['id'] is not None:" returns TypeError 'NoneType' object is not subscriptable, but print() returns a value


What I'm trying to do:

I'm trying to write a script that scrapes a website for product information.

Currently, the program uses a for-loop to scrape for a product price and a unique ID.

The for-loop contains two if-statements to stop it from scraping NoneTypes.

import requests
from bs4 import BeautifulSoup


def average(price_list):
    return sum(price_list) / len(price_list)


# Requests search data from Website
page_link = 'URL'
page_response = requests.get(page_link, timeout=5)  # gets the webpage (search) from Website
page_content = BeautifulSoup(page_response.content, 'html.parser')  # turns the webpage it just retrieved into a BeautifulSoup-object

# Selects the product listings from page content so we can work with these
product_listings = page_content.find_all("div", {"class": "unit flex align-items-stretch result-item"})

prices = []  # Creates a list to add the prices to
uids = [] # Creates a list to store the unique ids

for product in product_listings:

    # UIDS
    if product.find('a')['id'] is not None:
        uid = product.find('a')['id']
        uids.append(uid)

    # PRICES
    if product.find('p', class_ = 'result-price man milk word-break') is not None:# assures that the loop only finds the prices
        price = int(product.p.text[:-2].replace(u'\xa0', ''))  # makes a temporary variable where the last two chars of the string (,-) and whitespace are removed, turns into int
        prices.append(price)  # adds the price to the list

The problem:

On the line if product.find('a')['id'] is not None:, I get: Exception has occurred: TypeError 'NoneType' object is not subscriptable.

However, if I run print(product.find('a')['id']), I get the value I'm looking for, which makes me really confused. Doesn't that mean that the value is not a NoneType?

Also, if product.find('p', class_ = 'result-price man milk word-break') is not None: works flawlessly.

What I've tried:

I've tried assigning the result of product.find('p', class_ = 'result-price man milk word-break') to a variable and then testing it in the for-loop, but that did not work. I've also done my fair share of googling, but to no avail. The problem there might be that I'm relatively new to programming and don't know exactly what to search for. I've found a lot of answers to seemingly related problems, but none of them work in my code.

Any help would be greatly appreciated!


Solution

  • Just make an intermediate step:

    res = product.find('a')
    
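    if res is not None and res.get('id') is not None:
        uids.append(res['id'])
    

    That way, if find returns None because the listing contains no a tag, you never end up subscripting None. This is most likely also why print() seemed to work: it printed the ids of the earlier listings, and the error only occurred on a later listing without an a tag. Note that res.get('id') returns None when the tag has no id attribute, whereas res['id'] on such a tag would raise a KeyError rather than return None.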
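
To make the failure mode concrete, here is a minimal, self-contained sketch of the same guard. The HTML snippet, the result-item class, and the helper name extract_uid are made up for illustration and are not taken from the asker's site; the sketch assumes some listings lack an a tag and others have an a tag without an id.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second listing has no <a> tag at all,
# and the third has an <a> tag without an id attribute.
SAMPLE_HTML = """
<div class="result-item"><a id="item-1" href="#">First</a></div>
<div class="result-item"><span>Advert without a link</span></div>
<div class="result-item"><a href="#">Link without an id</a></div>
<div class="result-item"><a id="item-4" href="#">Fourth</a></div>
"""


def extract_uid(product):
    """Return the id of the first <a> inside product, or None if unavailable."""
    link = product.find('a')   # None when the listing contains no <a>
    if link is None:
        return None
    return link.get('id')      # Tag.get returns None instead of raising KeyError


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

uids = []
for product in soup.find_all('div', class_='result-item'):
    uid = extract_uid(product)
    if uid is not None:
        uids.append(uid)

print(uids)  # ['item-1', 'item-4']
```

With the unguarded product.find('a')['id'], this loop would handle the first listing fine and then raise the TypeError on the second one, which matches the symptom of seeing valid ids printed before the crash.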