I am trying to build a function that loops through the subsequent pages of a website and scrapes data from each one. I get the following NoneType error and don't know how to get around it:
if not page.find('a', {'class': 'btn btn-default current disabled'}):
AttributeError: 'NoneType' object has no attribute 'find'
from bs4 import *
import time
import pandas as pd
import pickle
import html5lib
from requests_html import HTMLSession

s = HTMLSession()
url = "https://cryptoli.st/lists/fixed-supply"


def get_data(url):
    r = s.get(url)
    global soup
    soup = BeautifulSoup(r.text, 'html.parser')
    return soup


def get_next_page(data):
    page = soup.find('ul', {'class': 'pager'})
    if not page.find('a', {'class': 'btn btn-default current disabled'}):
        url = 'https://cryptoli.st/lists/fixed-supply' + \
            str(page.find('li', {'class': 'paginate_button'}).find(
                'a')[{'class': 'btn btn-default next'}])
        return url
    else:
        return


data = get_data(url)
print(get_next_page(soup))
Any help would be greatly appreciated.
From the BeautifulSoup docs:
If find() can’t find anything, it returns None:
print(soup.find("nosuchtag")) # None
That means that most likely this:
soup.find('ul', {'class': 'pager'})
returns None, so the page.find(...) call on the next line raises the AttributeError. Make sure that such an element exists in the page you're trying to parse. It could be that the static HTML doesn't include it and the list is populated dynamically.
Indeed, if you open view-source:https://cryptoli.st/lists/fixed-supply in Chrome, you will see that there is no <ul class="pager" anywhere.
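To avoid the crash regardless of what the page contains, one option is to check each find() result before chaining another call on it. The following is only a sketch: the class names (pager, next, disabled) are taken from the question, the href attribute on the "next" link and the sample HTML strings are assumptions made up for illustration, and the real site's markup may look different.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://cryptoli.st/lists/fixed-supply"


def get_next_page(soup, base_url=BASE_URL):
    """Return the absolute URL of the next page, or None if there is none."""
    # find() returns None when nothing matches, so check before chaining.
    pager = soup.find("ul", class_="pager")
    if pager is None:
        return None

    # Class names come from the question; the real markup may differ.
    next_link = pager.find("a", class_="next")
    if next_link is None or "disabled" in next_link.get("class", []):
        return None

    # Assumption: the link target lives in an href attribute.
    href = next_link.get("href")
    if not href:
        return None
    return urljoin(base_url, href)


# Hypothetical markup, used only to exercise both branches offline.
with_pager = BeautifulSoup(
    '<ul class="pager"><li class="paginate_button">'
    '<a class="btn btn-default next" href="?page=2">Next</a></li></ul>',
    "html.parser",
)
without_pager = BeautifulSoup("<div>no pager here</div>", "html.parser")

print(get_next_page(with_pager))
print(get_next_page(without_pager))
```

With this shape, a missing pager simply yields None instead of an AttributeError, so the calling loop can stop cleanly. It does not make a dynamically generated pager appear: if the element is added by JavaScript, the HTML has to be rendered first (requests_html offers r.html.render() for that) or the data fetched from whatever endpoint the page itself calls.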