python web-scraping beautifulsoup finance yahoo-finance

Scrape ticker sector information from Yahoo Finance

I get an error when I try to scrape the "Sector" of a ticker from Yahoo Finance. I followed the library manual's advice to select the correct parent and child elements from the HTML page, but I could not capture the ticker's sector (for example, the sector for the AAPL ticker is Technology):

Below is the sector's HTML (view source): [image: Yahoo website page and HTML code]

Here is my attempt:

from bs4 import BeautifulSoup
import requests

url = 'https://finance.yahoo.com/quote/AAPL/profile?p=AAPL'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
sector_element = soup.find('span', text='Sector(s)').find_next('span', class_='Fw(600)')
print(sector_element.text)

I was expecting to get "Technology" but instead got the following error message:

AttributeError: 'NoneType' object has no attribute 'find_next'

Solution

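Try setting a browser-like User-Agent HTTP header so that the server returns the expected page. Without it, the response may not contain the "Sector(s)" span at all, in which case soup.find(...) returns None and the chained .find_next call raises the AttributeError: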

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/111.0'}

url = 'https://finance.yahoo.com/quote/AAPL/profile?p=AAPL'
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, 'html.parser')
sector_element = soup.find('span', string='Sector(s)').find_next('span', class_='Fw(600)')
print(sector_element.text)

Prints:

Technology
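
If the page layout or the server response changes, the chained call fails again with the same AttributeError. A minimal sketch of a more defensive lookup follows, assuming the same markup as above. The helper name extract_sector and the sample HTML fragment are hypothetical and only illustrate the None check. They are not taken from Yahoo's actual page.

```python
from bs4 import BeautifulSoup


def extract_sector(html):
    """Return the sector text from profile-page HTML, or None if absent."""
    soup = BeautifulSoup(html, 'html.parser')
    # find() returns None when no <span> has exactly this text,
    # for example when the server sent a different page than expected.
    label = soup.find('span', string='Sector(s)')
    if label is None:
        return None
    # Look for the next <span> carrying the class used in the answer above.
    value = label.find_next('span', class_='Fw(600)')
    return value.text.strip() if value is not None else None


# Hypothetical fragment imitating the markup targeted above.
sample = '<p><span>Sector(s)</span>: <span class="Fw(600)">Technology</span></p>'
print(extract_sector(sample))           # Technology
print(extract_sector('<html></html>'))  # None
```

With a live request, you would pass r.content to extract_sector and treat a None result as a sign that the response was not the page you expected.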