
AttributeError when web scraping


I received an AttributeError while web scraping, but I am unsure what I am doing wrong. What does AttributeError mean?

    response_obj = requests.get('https://en.wikipedia.org/wiki/Demographics_of_New_York_City').text
    soup = BeautifulSoup(response_obj,'lxml')
    Population_Census_Table = soup.find('table', {'class':'wikitable sortable'})

Preparation of the table:

    rows = Population_Census_Table.select("tbody > tr")[3:8]

    jurisdiction = []

    for row in rows:
        jurisdiction = {}
        tds = row.select('td')
        jurisdiction["jurisdiction"] = tds[0].text.strip()
        jurisdiction["population_census"] = tds[1].text.strip()
        jurisdiction["%_white"] = float(tds[2].text.strip().replace(",",""))
        jurisdiction["%_black_or_african_amercian"] = float(tds[3].text.strip().replace(",",""))
        jurisdiction["%_Asian"] = float(tds[4].text.strip().replace(",",""))
        jurisdiction["%_other"] = float(tds[5].text.strip().replace(",",""))
        jurisdiction["%_mixed_race"] = float(tds[6].text.strip().replace(",",""))
        jurisdiction["%_hispanic_latino_of_other_race"] = float(tds[7].text.strip().replace(",",""))
        jurisdiction["%_catholic"] = float(tds[7].text.strip().replace(",",""))
        jurisdiction["%_jewish"] = float(tds[8].text.strip().replace(",",""))
    
        jurisdiction.append(jurisdiction)

    print(jurisdiction)


 

AttributeError

   ---> 18     jurisdiction.append(jurisdiction)
   AttributeError: 'dict' object has no attribute 'append'

Solution

  • You start with jurisdiction as a list and then immediately rebind it to a dict inside the loop. You treat it as a dict until the error line, where you try to treat it as a list again. A dict has no append method, hence the AttributeError. You need a different name for the list at the start; possibly you meant jurisdictions (plural) for the list. However, IMO there are two other areas that also need fixing:

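    1. find returns a single table: the first match. The keys in your dict indicate you want a later table, not the first match.

    2. Your indexing is incorrect for the target table: tds[7] is used twice, and in the code below %_catholic and %_jewish are read from tds[10] and tds[12] instead.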

    You want something like:

    import requests
    from bs4 import BeautifulSoup
    
    response_obj = requests.get('https://en.wikipedia.org/wiki/Demographics_of_New_York_City').text
    soup = BeautifulSoup(response_obj,'lxml')
    Population_Census_Table = soup.select_one('.wikitable:nth-of-type(5)') #use css selector to target correct table.
    jurisdictions = []
    rows = Population_Census_Table.select("tbody > tr")[3:8]
    for row in rows:
        jurisdiction = {}
        tds = row.select('td')
        jurisdiction["jurisdiction"] = tds[0].text.strip()
        jurisdiction["population_census"] = tds[1].text.strip()
        jurisdiction["%_white"] = float(tds[2].text.strip().replace(",",""))
        jurisdiction["%_black_or_african_amercian"] = float(tds[3].text.strip().replace(",",""))
        jurisdiction["%_Asian"] = float(tds[4].text.strip().replace(",",""))
        jurisdiction["%_other"] = float(tds[5].text.strip().replace(",",""))
        jurisdiction["%_mixed_race"] = float(tds[6].text.strip().replace(",",""))
        jurisdiction["%_hispanic_latino_of_other_race"] = float(tds[7].text.strip().replace(",",""))
        jurisdiction["%_catholic"] = float(tds[10].text.strip().replace(",",""))
        jurisdiction["%_jewish"] = float(tds[12].text.strip().replace(",",""))
        jurisdictions.append(jurisdiction)
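
On the "what does AttributeError mean?" part: Python raises it when you ask an object for an attribute or method that its type does not have. Here is a minimal sketch of the same mistake without any network access. The `sample_rows` values are made-up placeholders, not data from the Wikipedia page, and `error_message` is just a name I picked for illustration.

```python
# Placeholder rows standing in for the scraped table cells (not real data).
sample_rows = [("Borough A", "1,000"), ("Borough B", "2,000")]

# Buggy pattern: the same name is first a list, then rebound to a dict.
jurisdiction = []
error_message = None
try:
    for name, population in sample_rows:
        jurisdiction = {}  # the name now refers to a dict; the list is gone
        jurisdiction["jurisdiction"] = name
        jurisdiction["population_census"] = population
        jurisdiction.append(jurisdiction)  # dicts have no append method
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Fixed pattern: one name for the list, another for each row's dict.
jurisdictions = []
for name, population in sample_rows:
    jurisdiction = {"jurisdiction": name, "population_census": population}
    jurisdictions.append(jurisdiction)

print(jurisdictions)
```

The first loop fails on its first iteration with the same message as in the question; the second builds a list with one dict per row.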