python, html, web-scraping, beautifulsoup

BeautifulSoup unable to extract data using attrs=class


I am extracting data for a research project. I have successfully used findAll('div', attrs={'class':'someClassName'}) on many websites, but on this particular website,

WebSite Link

it doesn't return any values when I use the attrs option. When I leave out the attrs option, I get the entire HTML DOM.

Here is the simple code that I started with to test it out:

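from urllib2 import urlopen  # Python 2; import assumed, not shown in the original post
from BeautifulSoup import BeautifulSoup as bs

soup = bs(urlopen(url))  # url is the page linked above
for div in soup.findAll('div', attrs={'class':'data'}):
    print div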

Solution

  • My code works fine when I fetch the page with requests:

    import requests
    from BeautifulSoup import BeautifulSoup as bs

    # Fetch the HTML
    r = requests.get(r'http://www.amazon.com/s/ref=sr_pg_1?rh=n:172282,k%3adigital%20camera&keywords=digital%20camera&ie=UTF8&qid=1343600585')
    html = r.text

    # Parse the HTML
    soup = bs(html)

    # Collect every div whose class is 'data'
    results = soup.findAll('div', attrs={'class': 'data'})

    print results
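
If the same findAll call still comes back empty for you, it may help to check whether the HTML you actually fetched contains any div with that class before blaming the parser. Below is a minimal sketch of such a check. It is written for Python 3 and uses only the standard library's html.parser instead of BeautifulSoup, so it is a separate diagnostic and not the code from the answer above. The names DivClassCounter, count_div_classes and sample_html are made up for illustration, and sample_html is a stand-in for whatever your request returned.

```python
from collections import Counter
from html.parser import HTMLParser


class DivClassCounter(HTMLParser):
    """Count how often each class name appears on <div> tags."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        class_value = dict(attrs).get("class") or ""
        # A class attribute can hold several space-separated names
        self.counts.update(class_value.split())


def count_div_classes(html):
    """Return a Counter mapping class name -> number of divs carrying it."""
    parser = DivClassCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical stand-in for the HTML returned by your request
sample_html = """
<html><body>
  <div class="data">Camera A</div>
  <div class="data highlighted">Camera B</div>
  <div class="footer">About</div>
</body></html>
"""

counts = count_div_classes(sample_html)
print(counts["data"])  # number of divs carrying the class "data"
```

If the count for your class is zero on the real page, the server probably sent different markup than what you see in the browser, and the fetch step is the place to look rather than the attrs filter.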