python web-scraping beautifulsoup urllib2

How to get text from inside a <span class="className">TEXT I WANT</span> in Python


How can I get the text described in the title? In the screenshot below it is 21,427.

I tried this and it did not work:

rating_count = soup.find("span", attrs={'class':'rating_count'})
print rating_count

[screenshot]

This is the output:

[screenshot]


Solution

  • This will do exactly what you are looking for.

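    # BeautifulSoup 3 import (Python 2); with bs4 it would be `from bs4 import BeautifulSoup`.
    from BeautifulSoup import BeautifulSoup
    
    data = '<span class="rating-count">TEXT I WANT</span>'
    soup = BeautifulSoup(data)
    # Match the class name exactly: this markup uses a hyphen, not an underscore.
    t = soup.find('span', {'class': 'rating-count'})
    print t.text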
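
For readers on Python 3, here is a minimal sketch of the same lookup using the `bs4` package. The helper name `span_text` is made up for illustration and is not part of BeautifulSoup. It also shows that `find` returns `None` when the class name does not match, which is what happens if `rating_count` (underscore) is searched for in markup that uses `rating-count` (hyphen).

```python
from bs4 import BeautifulSoup


def span_text(html, class_name):
    """Return the stripped text of the first <span> with the given class, or None."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", attrs={"class": class_name})
    if span is None:
        # No matching element: avoid calling .get_text() on None.
        return None
    return span.get_text(strip=True)


sample = '<span class="rating-count">TEXT I WANT</span>'
print(span_text(sample, "rating-count"))  # TEXT I WANT
print(span_text(sample, "rating_count"))  # None
```

Checking for `None` before reading the text avoids an `AttributeError` when the element is missing from the page.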
    

    EDITED:

    Based on the code you provided, it looks like no request header is defined, so Google doesn't send the information you are looking for. Consequently, BeautifulSoup couldn't find the span because it wasn't actually in the response. Try this; it works for me:

    import urllib2
    from BeautifulSoup import BeautifulSoup
    
    pkg = "com.mavdev.focusoutfacebook"
    url = "https://play.google.com/store/apps/details?id=" + pkg
    # Send a browser-like User-agent header with the request.
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    data = opener.open(url).read()
    
    soup = BeautifulSoup(data)
    t = soup.find('span', {'class': 'rating-count'})
    print t.text
    

    Result:

    >>> 
    1,397
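
The header trick above uses `urllib2`, which exists only in Python 2. As a rough sketch of the Python 3 counterpart, the same `User-agent` header can be attached with `urllib.request.Request`. The names `build_request` and `fetch_html` are hypothetical helpers, and nothing here confirms that the page still contains a `rating-count` span today. The example only builds the request; `fetch_html` is defined but not called, so no network access happens when the snippet runs.

```python
from urllib.request import Request, urlopen

PLAY_STORE_DETAILS = "https://play.google.com/store/apps/details?id="


def build_request(pkg, user_agent="Mozilla/5.0"):
    """Build a request for the app page with an explicit User-Agent header."""
    return Request(PLAY_STORE_DETAILS + pkg, headers={"User-Agent": user_agent})


def fetch_html(request, timeout=10):
    """Download the page body as text (performs a real network request)."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


request = build_request("com.mavdev.focusoutfacebook")
print(request.full_url)
# Request stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

The HTML returned by `fetch_html(request)` could then be passed to BeautifulSoup in the same way as `data` in the answer.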