Tags: python, python-3.x, beautifulsoup, web-crawler, index-error

Web Crawler Array error: "list index out of range"


I am not too strong in Python, but I am building a site for a guild I am part of in a game, and I am using a crawler to pull some of our members' data off another site (yes, I did receive permission to do so). I am using Beautiful Soup 4 with Python 3.7. I am receiving the error:

Traceback (most recent call last):
  File "/Users/UsersLaptop/Desktop/swgohScraper.py", line 21, in <module>
    temp = members[count]
IndexError: list index out of range

My Code is Here:

from requests import get
from bs4 import BeautifulSoup
# variables
count = 1

# lists to store data
names = []
gp = []
arenaRank = []

url = 'https://swgoh.gg/g/21284/gid-1-800-druidia/'
response = get(url)

soup = BeautifulSoup(response.text, 'html.parser')
type(soup)

members = soup.find_all('tr')
members.sort()

for users in members:
    temp = members[count]
    name = temp.td.a.strong.text
    names.append(name)
    count += 1

print(names)

I am guessing I am receiving this error because members has 50 entries in it but the 50th is null, and I would need to stop appending if the data was null. However, when I tried putting an if statement under my for loop, such as:

if users.find('tr') is not None:

it did not fix the issue. It would be greatly appreciated if someone could explain how to solve this and why the solution works. Thank you in advance!


Solution

  • This does the job of what you are trying to get from the code, i.e. extracting the names, as far as can be inferred from it:

    from requests import get
    from bs4 import BeautifulSoup

    # lists to store data
    names = []
    gp = []
    arenaRank = []

    url = 'https://swgoh.gg/g/21284/gid-1-800-druidia/'
    response = get(url)

    soup = BeautifulSoup(response.content, 'html.parser')

    # collect the text of every <strong> tag, skipping empty ones
    # (no encode() here: in Python 3 comparing bytes to '' is always unequal,
    # and it would fill names with bytes objects instead of strings)
    for users in soup.find_all('strong'):
        if users.text.strip() != '':
            names.append(users.text.strip())

    print(names)
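
As for why your original code fails: the for loop already visits every element of members, but count starts at 1 and is bumped on every pass, so once the loop reaches the last row it indexes past the end of the list and raises IndexError. (members.sort() is also unnecessary, and sorting Tag objects can itself fail since they have no defined ordering.) If you would rather keep the row-based approach, so you can later pull gp and arena rank from the same <tr>, a sketch along these lines should work; the <td><a><strong> nesting is taken from your code and assumed to match the page, and rows that lack it (e.g. header rows) are simply skipped:

    from requests import get
    from bs4 import BeautifulSoup

    url = 'https://swgoh.gg/g/21284/gid-1-800-druidia/'
    soup = BeautifulSoup(get(url).text, 'html.parser')

    names = []
    for row in soup.find_all('tr'):       # iterate the rows directly; no manual counter needed
        cell = row.find('td')             # header rows use <th>, so this is None for them
        if cell is None:
            continue
        strong = cell.find('strong')      # guard against rows without the expected nesting
        if strong is None:
            continue
        names.append(strong.text.strip())

    print(names)

Iterating the list directly (or, if you really need an index, using enumerate(members)) removes the need for the count variable entirely, which is what eliminates the IndexError.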