python class beautifulsoup google-finance

Get company name from a Google Finance page with Python

I would like to print the company name from a Google Finance page. The name is inside the div with class appbar-snippet-primary. The code I am using returns None or []. I wasn't able to reach the span tag containing the company name using BeautifulSoup.

html = urlopen('https://www.google.com/finance?q=F')
soup = BeautifulSoup(html, "html.parser")
x = soup.find(id='appbar-snippet-primary')
print(x)

Thank you for the explanation. I have updated the code as you suggested and included the stock price, created a loop, then stored the information in a dictionary.

from bs4 import BeautifulSoup
import requests

x = ('F', 'GE', 'GOOGL')
Company = {}

# Send a browser-like User-Agent with every request
head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}

for i in x:
    html = requests.get('https://www.google.com/finance?q=%s' % i, headers=head).content
    soup = BeautifulSoup(html, "html.parser")
    # Company name and price, both located by class name
    c = soup.find("div", class_="appbar-snippet-primary").text
    p = soup.find('span', class_='pr').span.text
    Company[c] = p

# Print the name left-aligned and the price right-aligned
for k, v in Company.items():
    print('{:<30} {:>8}'.format(k, v))

Solution

The value is not generated dynamically by JavaScript; it is in the page source. All you need to do is send a User-Agent header and search with the correct tag name and attribute (the div is identified by its class, not by an id). The following example using requests gets what you want:

from bs4 import BeautifulSoup

import requests

head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}
html = requests.get('https://www.google.com/finance?q=F', headers=head).content
soup = BeautifulSoup(html, "html.parser")
x = soup.find("div", class_="appbar-snippet-primary")
print(x)

This prints:

<div class="appbar-snippet-primary"><span>Ford Motor Company</span></div>

Running the same code with x.text to extract just the text shows that the output is correct:

In [14]: from bs4 import BeautifulSoup

In [15]: import requests

In [16]: head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}

In [17]: html = requests.get('https://www.google.com/finance?q=F', headers=head).content

In [18]: soup = BeautifulSoup(html, "html.parser")

In [19]: x = soup.find("div", class_="appbar-snippet-primary")

In [20]: print(x.text)
Ford Motor Company
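
The code in the question searched with id='appbar-snippet-primary', while the element carries that value in its class attribute. The difference can be reproduced offline with a minimal sketch. It assumes the markup is exactly the div printed earlier, hard-coded as a string, and the names by_id and by_class are only illustrative:

```python
from bs4 import BeautifulSoup

# Static copy of the div printed earlier, so no network request is needed.
html = '<div class="appbar-snippet-primary"><span>Ford Motor Company</span></div>'
soup = BeautifulSoup(html, "html.parser")

# Lookup by id: no element in this markup has an id attribute.
by_id = soup.find(id="appbar-snippet-primary")

# Lookup by class: matches the div's class attribute.
by_class = soup.find("div", class_="appbar-snippet-primary")

print(by_id)          # None
print(by_class.text)  # Ford Motor Company
```

The trailing underscore in class_ is needed because class is a reserved word in Python.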

Now without a user-agent:

In [21]: from bs4 import BeautifulSoup

In [22]: import requests

In [23]: html = requests.get('https://www.google.com/finance?q=F').content

In [24]: soup = BeautifulSoup(html, "html.parser")

In [25]: x = soup.find("div", class_="appbar-snippet-primary")

In [26]: print(x)
None

Here x is None because, without the User-Agent header, you don't get the same source returned, so the div is not found.
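
Because find returns None when the div is missing, calling .text on the result directly would raise an AttributeError. One way to guard against that is sketched below. The helper name company_name and the placeholder page are hypothetical and only for illustration, and the HTML is passed in as a string so the example runs without a network connection:

```python
from bs4 import BeautifulSoup


def company_name(html):
    """Return the company name found in html, or None if the div is absent.

    Hypothetical helper, not part of BeautifulSoup or requests.
    """
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="appbar-snippet-primary")
    if div is None:
        # The expected div is not in this source (e.g. a different page was served).
        return None
    return div.get_text(strip=True)


# Markup containing the div, as printed earlier in the answer.
with_div = '<div class="appbar-snippet-primary"><span>Ford Motor Company</span></div>'
# Made-up stand-in for a page source that lacks the div.
without_div = "<html><body><p>placeholder page</p></body></html>"

print(company_name(with_div))     # Ford Motor Company
print(company_name(without_div))  # None
```

In the loop from the question, a None result could be used to skip a ticker instead of letting the whole run fail.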