Python retrieving value from URL


I'm trying to write a Python script that checks money.rediff.com for a particular stock price and prints it. I know this can be done easily with their API, but I want to learn how urllib2 works, so I'm trying to do it the old-fashioned way. However, I'm stuck on how to use urllib2. Many online tutorials tell me to use "Inspect element" on the value I need and then split the string to get it. All the examples in the videos have values inside HTML tags that are easy to split on, but mine looks like this:

<div class="f16">
<span id="ltpid" class="bold" style="color: rgb(0, 0, 0); background: rgb(255, 255, 255);">6.66</span> &nbsp; 
<span id="change" class="green">+0.50</span> &nbsp; 

<span id="ChangePercent" style="color: rgb(130, 130, 130); font-weight: normal;">+8.12%</span>
</div>

I only need the "6.66" from the second line (the span with id "ltpid"). How do I go about extracting it? I'm very new to urllib2 and Python. Any help would be greatly appreciated. Thanks in advance.


Solution

  • You can certainly do this with just urllib2 and perhaps a regular expression, but I'd encourage you to use better tools, namely requests and Beautiful Soup.

    Here's a complete program to fetch a quote for "Tata Motors Ltd.":

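    from bs4 import BeautifulSoup
    import requests
    
    # Download the quote page and fail early on an HTTP error status.
    response = requests.get('http://money.rediff.com/companies/Tata-Motors-Ltd/10510008')
    response.raise_for_status()
    
    # The last traded price is the text of the element with id="ltpid".
    soup = BeautifulSoup(response.content, 'html.parser')
    quote = float(soup.find(id='ltpid').get_text())
    
    print(quote)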
    

    EDIT

    Here's a Python 2 version just using urllib2 and re:

    import re
    import urllib2
    
    html = urllib2.urlopen('http://money.rediff.com/companies/Tata-Motors-Ltd/10510008').read()
    
    # Capture the text between the opening <span id="ltpid" ...> tag and the next "<".
    quote = float(re.search(r'<span id="ltpid"[^>]*>([^<]*)', html).group(1))
    
    print quote
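
For anyone on Python 3, where urllib2 no longer exists (its functionality moved to urllib.request), a rough port of the regex version might look like the sketch below. The names extract_quote, fetch_quote and LTP_PATTERN are made up for illustration. The thousands-separator handling and the UTF-8 decoding are assumptions about the page, not something confirmed for this site. The parsing step is kept separate from the download so it can be tried on the HTML snippet from the question without network access.

```python
import re
from urllib.request import urlopen

# Hypothetical name: matches the opening <span id="ltpid" ...> tag
# and captures the text up to the next "<".
LTP_PATTERN = re.compile(r'<span id="ltpid"[^>]*>([^<]*)')


def extract_quote(html):
    """Return the price inside the id="ltpid" span of an HTML string."""
    match = LTP_PATTERN.search(html)
    if match is None:
        raise ValueError('no span with id="ltpid" found')
    # Assumption: larger prices may be shown as "1,234.50", so commas are dropped.
    return float(match.group(1).replace(',', '').strip())


def fetch_quote(url):
    """Download a page and extract the quote (needs network access)."""
    with urlopen(url, timeout=10) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        html = response.read().decode('utf-8', errors='replace')
    return extract_quote(html)


# The snippet from the question, used here as offline test input.
sample = '''<div class="f16">
<span id="ltpid" class="bold" style="color: rgb(0, 0, 0);">6.66</span> &nbsp;
<span id="change" class="green">+0.50</span> &nbsp;
</div>'''

print(extract_quote(sample))  # prints 6.66
```

To query the live page, call fetch_quote('http://money.rediff.com/companies/Tata-Motors-Ltd/10510008'). The regex depends on id="ltpid" being the first attribute of the span, so it will break if the markup changes, which is one reason the parser-based version above is the more robust choice.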