python-3.x  web-scraping  beautifulsoup  href  nonetype

Check a variable for NoneType and break a while loop


I am very new to programming and have started teaching myself web scraping with Python. I am scraping player data from multiple pages of a site and built a while loop that reads the href of a 'next' button to get to the next player's page. Everything works except breaking out of the while loop after the last available player. On that page the 'next' button is grayed out and has no link behind it, so I want to stop iterating and save everything to a CSV file.

My script looks like this:

#name base url and first page to start

BaseUrl = #url
PageUrl = #also url

while True:

  #scraping tables

  try:
      # retrieve link for 'next' player in order
      link = soup.find(attrs={"class": "go_to_next_player"}).get('href')
      # join base url and new link href
      PageUrl = BaseUrl + link
      if link is None:
          break
  except IndexError as e:
      print(e)
      break

#writing to csv

I thought I could detect an empty href by testing 'is None' and then breaking, but I get this error:

In line > PageUrl = BaseUrl + link
TypeError: must be str, not NoneType

Help would be greatly appreciated! I am very new to this, so please disregard my beginner code.


Solution

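  • The error occurs because BaseUrl + link runs before the None check: .get('href') returns None when the tag has no href attribute, and a str cannot be concatenated with None. Check link first and build the URL only when it is not None; otherwise break out of the loop: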

    if link is not None:
        PageUrl = BaseUrl + link
    else:
        break
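
    A slightly more defensive variant, as a minimal sketch rather than the only way to do it: the question's except IndexError would not catch the case where soup.find(...) itself returns None (calling .get on None raises AttributeError), so the helper below handles both "no button" and "button without href". The function name next_page_url and the example.com URLs are hypothetical, and urljoin from the standard library is swapped in for BaseUrl + link.

```python
from urllib.parse import urljoin


def next_page_url(base_url, next_button):
    """Return the absolute URL of the next page, or None if there is none.

    `next_button` is whatever soup.find(...) returned: a tag-like object
    with a .get() method, or None when the element is missing.
    """
    if next_button is None:          # element not found on the page
        return None
    href = next_button.get("href")   # None when the tag has no href
    if not href:                     # also treats href="" as "no link"
        return None
    return urljoin(base_url, href)


# Plain dicts stand in for BeautifulSoup tags here, since a dict also
# offers .get(key). The URL and path are made up for illustration.
print(next_page_url("https://example.com/", {"href": "players/2"}))
print(next_page_url("https://example.com/", {"class": "go_to_next_player"}))
print(next_page_url("https://example.com/", None))
```

    Inside the while loop this would be used by passing the result of soup.find(attrs={"class": "go_to_next_player"}) as next_button, then breaking when the function returns None and assigning the result to PageUrl otherwise.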