
Python issue: TypeError: unhashable type: 'slice' during web scraping


I am attempting to scrape some information from a website. I was able to scrape the text that I was looking for, but when I try to create a function to append the texts together, I get a TypeError: unhashable type: 'slice'.

Does anybody know what may be happening here and how to fix it?

Here is the code in question:

records = []
for result in results:
    name = result.contents[0][0:-1]

And here is the code in its entirety, for reproduction purposes:

import requests
from bs4 import BeautifulSoup

r = requests.get('https://skinsalvationsf.com/2012/08/updated-comedogenic-ingredients-list/')
soup = BeautifulSoup(r.text, 'html.parser')
results = soup.find_all('td', attrs={'valign':'top'})

records = []
for result in results:
    name = result.contents[0][0:-1]

A sample of results items:

<td valign="top" width="33%">Acetylated Lanolin <sup>5</sup></td>,
<td valign="top" width="33%">Coconut Butter<sup> 8</sup></td>,
...
<td valign="top" width="33%"><sup> </sup></td>

Thanks in advance!!


Solution

  • Some of your collected results contain no text, only Tag objects. In that case result.contents[0] is a Tag rather than a string, and indexing a Tag looks the key up in the Tag's attributes dictionary, so passing it a slice raises the TypeError.

    You can catch such errors with a try-except block,

    for result in results:
        try:
            name = result.contents[0][0:-1]
        except TypeError:
            continue
    

    Or you could use .strings, which yields only the NavigableString contents (including those nested inside child tags),

    for result in results:
        name = list(result.strings)[0][0:-1]
    

    But it seems that only the last item has no text content, so you could simply drop it from the results.

    results = soup.find_all('td', attrs={'valign':'top'})[:-1]
    
    for result in results:
        name = result.contents[0][:-1]
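
As a further option, the type of the first child can be checked explicitly instead of relying on an exception or on the position of the empty cell. The following is only a minimal sketch: SAMPLE_HTML is a hypothetical inline snippet modeled on the sample rows in the question (wrapped in a table for illustration), and extract_names is a helper name made up for this example. It also uses strip() rather than [0:-1], on the assumption that the slice was meant to drop a trailing space, since [0:-1] would cut the final "r" from "Coconut Butter", which has no trailing space.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical sample modeled on the rows shown in the question.
SAMPLE_HTML = """
<table><tr>
<td valign="top" width="33%">Acetylated Lanolin <sup>5</sup></td>
<td valign="top" width="33%">Coconut Butter<sup> 8</sup></td>
<td valign="top" width="33%"><sup> </sup></td>
</tr></table>
"""


def extract_names(html):
    """Return the leading text of each matching cell, skipping cells without one."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for cell in soup.find_all("td", attrs={"valign": "top"}):
        # An empty cell has no children at all.
        first = cell.contents[0] if cell.contents else None
        # A Tag here (e.g. <sup>) is what triggered the error in the question.
        if not isinstance(first, NavigableString):
            continue
        name = first.strip()  # remove surrounding whitespace only
        if name:
            names.append(name)
    return names


records = extract_names(SAMPLE_HTML)
print(records)
```

With the real page, the same function could be called on r.text from the question's code; whether every cell on that page follows the pattern of the sample is not something this sketch verifies.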