python html python-2.7 beautifulsoup html-parsing

BeautifulSoup4: select elements where attributes are not equal to x


I'd like to do something like this:

soup.find_all('td', attrs!={"class":"foo"})

I want to find all td elements that do not have the class foo.
Obviously the above doesn't work. What does?


Solution

  • BeautifulSoup really makes the "soup" beautiful and easy to work with.

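    You can pass a function as the attribute value. It is called with the attribute's value (None when the element has no such attribute) and should return True for the elements you want to keep: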

    soup.find_all('td', class_=lambda x: x != 'foo')
    

    Demo:

    >>> from bs4 import BeautifulSoup
    >>> data = """
    ... <tr>
    ...     <td>1</td>
    ...     <td class="foo">2</td>
    ...     <td class="bar">3</td>
    ... </tr>
    ... """
    >>> soup = BeautifulSoup(data, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
    >>> for element in soup.find_all('td', class_=lambda x: x != 'foo'):
    ...     print(element.text)
    ... 
    1
    3
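
One caveat about the demo above: class is a multi-valued attribute, and Beautiful Soup appears to try the function against each class name separately. A cell such as `<td class="foo bar">` may therefore still match, because 'bar' != 'foo' is true. The sketch below shows two alternatives that look at the whole class list instead. It assumes the html.parser backend (so that class comes back as a list) and, for the select() variant, bs4 4.7 or later with its bundled CSS selector support. The helper name is_td_without_foo and the extra fourth cell are illustrative additions, not part of the original answer.

```python
from bs4 import BeautifulSoup

data = """
<tr>
    <td>1</td>
    <td class="foo">2</td>
    <td class="bar">3</td>
    <td class="foo bar">4</td>
</tr>
"""
soup = BeautifulSoup(data, "html.parser")


def is_td_without_foo(tag):
    # Filter on the whole tag rather than on a single attribute value.
    # With an HTML parser, tag.get("class") is a list of class names;
    # the [] default covers cells that have no class attribute at all.
    return tag.name == "td" and "foo" not in tag.get("class", [])


without_foo = [td.text for td in soup.find_all(is_td_without_foo)]

# Same idea as a CSS selector (assumes bs4 >= 4.7).
via_select = [td.text for td in soup.select("td:not(.foo)")]

print(without_foo)
print(via_select)
```

Both variants skip cell 4 as well as cell 2, since they exclude any cell whose class list contains foo rather than comparing one class name at a time.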