I get an error when I try to scrape the "Sector" of a ticker from Yahoo Finance.
I followed the library manual's advice to select the correct parent and child elements from the HTML page, but I could not capture the sector of the ticker (for example, the sector for the AAPL ticker is Technology).
Below is the sector's html code (view source):
Here is my attempt:
from bs4 import BeautifulSoup
import requests
url = 'https://finance.yahoo.com/quote/AAPL/profile?p=AAPL'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
sector_element = soup.find('span', text='Sector(s)').find_next('span', class_='Fw(600)')
print(sector_element.text)
I was expecting to get "Technology", but instead got the following error message:
AttributeError: 'NoneType' object has no attribute 'find_next'
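Try setting a proper User-Agent HTTP header to get the correct response from the server. Without it, the server most likely returns a page that does not contain the expected span, so soup.find returns None and the chained find_next call fails: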
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/111.0'}
url = 'https://finance.yahoo.com/quote/AAPL/profile?p=AAPL'
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
sector_element = soup.find('span', string='Sector(s)').find_next('span', class_='Fw(600)')
print(sector_element.text)
Prints:
Technology
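If the page layout changes or the server returns a different page again, the chained call will raise the same AttributeError. Below is a minimal sketch of a more defensive lookup, assuming the markup matches what the snippet above relies on (a span with the text "Sector(s)" followed by a span with class "Fw(600)"). The helper name extract_sector and the SAMPLE_HTML string are hypothetical, made up for illustration so the example runs without a network request:

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_sector(html: str) -> Optional[str]:
    """Return the sector text, or None if the expected markup is absent."""
    soup = BeautifulSoup(html, "html.parser")

    # find() returns None when no span has exactly this text,
    # so check before chaining another call onto it.
    label = soup.find("span", string="Sector(s)")
    if label is None:
        return None

    # find_next() also returns None when nothing matches.
    value = label.find_next("span", class_="Fw(600)")
    return value.get_text(strip=True) if value is not None else None


# Hypothetical sample markup (an assumption, not the real Yahoo Finance page).
SAMPLE_HTML = (
    '<p><span>Sector(s)</span>: '
    '<span class="Fw(600)">Technology</span></p>'
)

print(extract_sector(SAMPLE_HTML))
print(extract_sector("<html><body>Some other page</body></html>"))
```

With the sample markup the first call prints Technology and the second prints None. In the real script you would pass r.content (or r.text) to the helper and handle the None case, for example by printing r.status_code to see what the server actually returned.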