Thanks for reading! For my project, I am going through company annual reports to pull the names and positions of board members. Because different companies use different formats, I would like to try one scraping method and, if it raises a "NoneType" error (because it does not find an attribute or a keyword), move on to a different method. I just need a way to say: if there is a NoneType error, try the next method. Below is one method that results in an error.
tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
resticker = []
for row in tables_ticker.find_all("tr")[1:]:
    #print([cell.get_text(strip=True) for cell in row.find_all("td")])
    if row:
        resticker.append([cell.get_text(strip=True) for cell in row.find_all("td")])
non_empty_ticker = [sublist for sublist in resticker if any(sublist)]
df_ticker = pd.DataFrame.from_records(non_empty_ticker)
df_ticker[df_ticker == ''] = np.nan
df_ticker=df_ticker.dropna(axis=1, how='all')
print(df_ticker)
Error:
Traceback (most recent call last):
  File "C:/Users/james/PycharmProjects/untitled2/Edgar/WMT Working.py", line 84, in <module>
    tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
AttributeError: 'NoneType' object has no attribute 'find_parent'
Here's a simple example you can apply to your code:
for item in ["Hello", "World", None, "Foo", None, "Bar"]:
    print(item.upper())
Output:
HELLO
WORLD
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'upper'
>>>
As you can see, once the for-loop reaches the third item in the list (which is not a string, it's a NoneType object), an exception is raised because NoneType objects don't have an upper method. This worked for the first two iterations because strings do have an upper method.
Solution - use a try-except block:
for item in ["Hello", "World", None, "Foo", None, "Bar"]:
    try:
        print(item.upper())
    except AttributeError:
        continue
Output:
HELLO
WORLD
FOO
BAR
>>>
We wrapped the line of code that can raise an AttributeError in a try-except block. If the line raises such an exception, we use the continue keyword to skip this iteration of the loop and move on to the next item in the list.
In the same way, you can wrap this line in a try-except block:
tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
Instead of using continue inside a loop, however, you can switch to a different scraping method in the except block.
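One way to "try the next method" is to put each scraping method in its own function and loop over them, moving on whenever one raises AttributeError. The following is only a minimal sketch: first_successful, from_age_table, from_director_list, and the dictionary standing in for the parsed page are all hypothetical names and placeholder data, not part of your code or of BeautifulSoup. The stand-in functions fail the same way your line does, by calling a method on None.

```python
def first_successful(methods, page):
    """Return (method name, result) for the first method that does not
    raise AttributeError, or (None, None) if every method fails."""
    for method in methods:
        try:
            return method.__name__, method(page)
        except AttributeError:
            # This layout did not match; try the next method.
            continue
    return None, None


def from_age_table(page):
    # Hypothetical stand-in for find(text="Age").find_parent("table"):
    # dict.get returns None when the key is missing, so .split raises
    # AttributeError, just like calling find_parent on None.
    return page.get("age_table").split(";")


def from_director_list(page):
    # Hypothetical second method for a different report layout.
    return page.get("director_list").split(";")


# Placeholder page that only matches the second layout.
page = {"director_list": "Jane Doe, Chair;John Roe, Director"}

name, rows = first_successful([from_age_table, from_director_list], page)
print(name, rows)
```

In your script, each function would take annual_report_page_soup and contain one of your real scraping methods. Keeping the try body limited to the method call means an AttributeError from unrelated code is less likely to be silently swallowed.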