Web scraping - Adding an if statement for when a "NoneType" object has no attribute

Thanks for reading! For my project I am going through company annual reports to pull the names of board members and their positions. Because different companies use different formats, I would like to try one scraping method and, if it results in a "NoneType" error (because that method does not find an attribute or a keyword), move on to a different method. I just need a way to say: if there is a NoneType error, try the next method. Below is one method that results in an error.

import numpy as np
import pandas as pd

# annual_report_page_soup is the BeautifulSoup object for the report page (built earlier)
tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
resticker = []
for row in tables_ticker.find_all("tr")[1:]:
    # Collect the stripped text of every cell in this row
    resticker.append([cell.get_text(strip=True) for cell in row.find_all("td")])

# Build the DataFrame once, after all rows have been collected
non_empty_ticker = [sublist for sublist in resticker if any(sublist)]
df_ticker = pd.DataFrame.from_records(non_empty_ticker)
df_ticker[df_ticker == ''] = np.nan
df_ticker = df_ticker.dropna(axis=1, how='all')

print(df_ticker)

Error:

Traceback (most recent call last):
  File "C:/Users/james/PycharmProjects/untitled2/Edgar/WMT Working.py", line 84, in <module>
    tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
AttributeError: 'NoneType' object has no attribute 'find_parent'

Solution

Here's a simple example you can apply to your code:

for item in ["Hello", "World", None, "Foo", None, "Bar"]:
    print(item.upper())

Output:

HELLO
WORLD
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'upper'
>>>

As you can see, once the for-loop reaches the third item in the list (None, which is a NoneType object rather than a string), an exception is raised because NoneType objects have no upper method. The first two iterations worked because strings do have an upper method.

Solution - use a try-except block:

for item in ["Hello", "World", None, "Foo", None, "Bar"]:
    try:
        print(item.upper())
    except AttributeError:
        continue

Output:

HELLO
WORLD
FOO
BAR
>>>

We wrapped the line that can raise an AttributeError in a try-except block. If that line raises the exception, the continue keyword skips this iteration of the loop and moves on to the next item in the list.

In the same way, you can wrap this line:

tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")

in a try-except block. Instead of using continue inside a loop, however, the except branch is where you switch to the next scraping method.
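
To make the "try the next method" idea concrete, here is a minimal sketch of one way to chain several scraping methods. Everything in it is an assumption for illustration: the function names (scrape_by_age_header, scrape_by_aged_keyword, try_methods) are hypothetical, and the stand-in methods use re.search from the standard library instead of BeautifulSoup so the example runs on its own. re.search returns None when nothing matches, much like soup.find does, so the failure mode is the same AttributeError. In your code, each method would contain its own soup.find(...) chain instead.

```python
import re


def scrape_by_age_header(text):
    # Hypothetical method 1: raises AttributeError when "Age:" is absent,
    # because re.search returns None and None has no .group attribute.
    return re.search(r"Age:\s*(\d+)", text).group(1)


def scrape_by_aged_keyword(text):
    # Hypothetical method 2: handles a different layout.
    return re.search(r"aged\s+(\d+)", text).group(1)


def try_methods(text, methods):
    """Return the result of the first method that does not hit a None."""
    for method in methods:
        try:
            return method(text)
        except AttributeError:
            # This method found nothing; fall through to the next one.
            continue
    return None  # every method failed


methods = [scrape_by_age_header, scrape_by_aged_keyword]
print(try_methods("Jane Doe, Director, Age: 61", methods))  # prints 61
print(try_methods("John Roe, Chairman, aged 58", methods))  # prints 58
print(try_methods("No matching text here", methods))        # prints None
```

Keeping each format in its own function means adding another report layout is just one more entry in the list. If you would rather use an if statement than catch the exception, you can also store the result of find in a variable and check whether it is None before calling find_parent on it.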