python screen-scraping beautifulsoup

Beautiful Soup cannot find a CSS class if the object has other classes, too


If a page has <div class="class1"> and <p class="class1">, then soup.findAll(True, 'class1') will find them both.

If the page has <p class="class1 class2">, though, that element will not be found. How do I find all elements with a certain class, regardless of whether they have other classes too?


Solution

  • In case anybody comes across this question, BeautifulSoup (the bs4 package) now supports this:

    Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)]
    Type "copyright", "credits" or "license" for more information.
    
    In [1]: import bs4
    
    In [2]: soup = bs4.BeautifulSoup('<div class="foo bar"></div>', 'html.parser')
    
    In [3]: soup(attrs={'class': 'bar'})
    Out[3]: [<div class="foo bar"></div>]
    

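    Also, you don't have to type findAll anymore: calling the soup object directly, as in soup(...), is equivalent to calling find_all (named findAll in older versions).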
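
For anyone who wants a fuller illustration, here is a minimal sketch of the same idea applied to the markup from the question. It assumes the bs4 package is installed and uses the built-in "html.parser" backend. The sample HTML and the variable names are made up for the example, and the class_ keyword and select() are alternatives to the attrs dictionary shown above, not something the original answer used.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup mirroring the elements described in the question.
html = """
<div class="class1">only class1</div>
<p class="class1">only class1</p>
<p class="class1 class2">class1 and class2</p>
<p class="class2">only class2</p>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches any element whose class list contains "class1",
# whether or not the element carries other classes as well.
by_keyword = soup.find_all(class_="class1")

# The same query written as a CSS selector.
by_selector = soup.select(".class1")

# Chaining class selectors requires every listed class to be present.
both_classes = soup.select("p.class1.class2")

for tag in by_keyword:
    print(tag.name, tag.get("class"))
```

The keyword is spelled class_ with a trailing underscore because class is a reserved word in Python.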