
Get company name from a Google Finance page with Python


I would like to print the company name from a Google Finance page. The name is in a span inside the div with class appbar-snippet-primary. The code I am using returns None or []; I wasn't able to reach the span tag containing the company name using BeautifulSoup.

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen('https://www.google.com/finance?q=F')
soup = BeautifulSoup(html, "html.parser")
x = soup.find(id='appbar-snippet-primary')
print(x)

Thank you for the explanation. I have updated the code as you suggested and included the stock price, created a loop, then stored the information in a dictionary.

from bs4 import BeautifulSoup
import requests

x = ('F', 'GE', 'GOOGL')
Company = {}

# Browser-like User-Agent; the header is the same for every request.
head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}

for i in x:
    html = requests.get('https://www.google.com/finance?q=%s' % i, headers=head).content
    soup = BeautifulSoup(html, "html.parser")
    # Company name and price for the current ticker.
    c = soup.find("div", class_="appbar-snippet-primary").text
    p = soup.find('span', class_='pr').span.text
    Company[c] = p

for k, v in Company.items():
    print('{:<30} {:>8}'.format(k, v))

Solution

  • The value is not dynamically generated by JavaScript; it is in the page source. All you need to do is send a User-Agent header and look the element up correctly: appbar-snippet-primary is the class of a div, not an id. The following example using requests gets what you want:

    from bs4 import BeautifulSoup
    
    import requests
    
    head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}
    html = requests.get('https://www.google.com/finance?q=F', headers=head).content
    soup = BeautifulSoup(html, "html.parser")
    x = soup.find("div", class_="appbar-snippet-primary")
    print(x)
    

    Which returns:

    <div class="appbar-snippet-primary"><span>Ford Motor Company</span></div>
    

    If we run the code using x.text to pull out the text, you can see the output is correct:

    In [14]: from bs4 import BeautifulSoup
    
    In [15]: import requests
    
    In [16]: head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}
    
    In [17]: html = requests.get('https://www.google.com/finance?q=F', headers=head).content
    
    In [18]: soup = BeautifulSoup(html, "html.parser")
    
    In [19]: x = soup.find("div", class_="appbar-snippet-primary")
    
    In [20]: print(x.text)
    Ford Motor Company
    

    Now without a user-agent:

    In [21]: from bs4 import BeautifulSoup
    
    In [22]: import requests
    
    In [23]: html = requests.get('https://www.google.com/finance?q=F').content
    
    In [24]: soup = BeautifulSoup(html, "html.parser")
    
    In [25]: x = soup.find("div", class_="appbar-snippet-primary")
    
    In [26]: print(x)
    None
    

    And x is None, because without the User-Agent header you don't get the same source returned.
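
Since soup.find returns None whenever the div is missing, calling .text on the result directly raises an AttributeError. A minimal sketch of a guarded lookup follows. The helper name company_name and the placeholder markup in without_div are assumptions made for illustration; with_div is the markup from the output shown above. It parses strings rather than making a request, so it runs offline.

```python
from bs4 import BeautifulSoup


def company_name(html):
    """Return the company name from page source, or None if the div is absent.

    Hypothetical helper, not part of the original answer.
    """
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="appbar-snippet-primary")
    if div is None:
        # Usually means a different page source came back,
        # for example because no User-Agent header was sent.
        return None
    return div.get_text(strip=True)


# Markup copied from the output shown in the answer.
with_div = '<div class="appbar-snippet-primary"><span>Ford Motor Company</span></div>'
# Assumed stand-in for a page source that lacks the div.
without_div = '<html><body><p>placeholder</p></body></html>'

print(company_name(with_div))     # Ford Motor Company
print(company_name(without_div))  # None

# Looking the value up as an id finds nothing, because it is a class.
print(BeautifulSoup(with_div, "html.parser").find(id="appbar-snippet-primary"))  # None
```

In the loop from the question, a None result could be used to skip a ticker instead of letting the script crash partway through.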