Why is the 'Growth Estimates' table not being detected by beautifulsoup on this website?


I tried to scrape the "Growth Estimates" table from the URL below using Beautiful Soup and requests, but my code can't find the table. With the browser's inspection tool I can see that the table is there, and I couldn't find any sign that it is loaded dynamically, though I could be wrong.

url = https://finance.yahoo.com/quote/AAPL/analysis?p=AAPL

Is someone able to explain the issue and offer a solution?

Thank you!

import requests
from bs4 import BeautifulSoup

def get_growth_data(symbol):
    url = f"https://finance.yahoo.com/quote/{symbol}/analysis?p={symbol}"  # f-string so {symbol} is substituted
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the table containing the growth data
    table = soup.find("table", class_="W(100%) M(0) BdB Bdc($seperatorColor) Mb(25px)")

    if table is None:
        print("Table not found.")
        return []

    # Extract the growth values from the table
    growth_values = []
    rows = table.find_all("tr")
    for row in rows:
        columns = row.find_all("td")
        if len(columns) >= 2:
            growth_values.append(columns[1].text)

    return growth_values

symbol = 'AAPL'
growth_data = get_growth_data(symbol)
print(growth_data)


Solution

  • To get the correct response from the server, set the User-Agent HTTP header in your request (by default, requests identifies itself as python-requests rather than as a browser):

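    from io import StringIO

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = 'https://finance.yahoo.com/quote/AAPL/analysis?p=AAPL'
    # Browser-like User-Agent so the server returns the expected page
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0'}
    soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

    # Select the table whose text contains "Growth Estimates"
    table = soup.select_one('table:-soup-contains("Growth Estimates")')
    # Wrap the HTML in StringIO: pandas 2.1+ deprecates passing literal HTML strings
    df = pd.read_html(StringIO(str(table)))[0]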
    
    print(df)
    

    Prints:

               Growth Estimates    AAPL  Industry  Sector(s)  S&P 500
    0              Current Qtr.  -0.80%       NaN        NaN      NaN
    1                 Next Qtr.   5.40%       NaN        NaN      NaN
    2              Current Year  -2.30%       NaN        NaN      NaN
    3                 Next Year   9.90%       NaN        NaN      NaN
    4  Next 5 Years (per annum)   8.02%       NaN        NaN      NaN
    5  Past 5 Years (per annum)  23.64%       NaN        NaN      NaN
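
If you would rather keep the shape of the function from the question, the same fix can be folded into it. This is only a sketch: `parse_growth_table`, `HEADERS` and `SAMPLE_HTML` are names made up for illustration, the 10-second timeout is an arbitrary choice, and the sample markup is a simplified stand-in for the real page, which may be laid out differently. Splitting the parsing from the fetching lets you check the table lookup offline, without hitting the site.

```python
import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent, same value as in the answer above
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) "
                  "Gecko/20100101 Firefox/113.0"
}


def parse_growth_table(html):
    """Return {row label: first value} from the 'Growth Estimates' table.

    Returns an empty dict when no table containing that text is found.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Match on the table's text instead of its generated CSS class names
    table = soup.select_one('table:-soup-contains("Growth Estimates")')
    if table is None:
        return {}

    growth = {}
    for row in table.find_all("tr"):
        cells = row.find_all("td")  # the header row uses <th>, so it is skipped
        if len(cells) >= 2:
            label = cells[0].get_text(strip=True)
            growth[label] = cells[1].get_text(strip=True)
    return growth


def get_growth_data(symbol):
    """Fetch the analysis page for `symbol` and parse the growth table."""
    url = f"https://finance.yahoo.com/quote/{symbol}/analysis?p={symbol}"
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()  # fail loudly on an HTTP error status
    return parse_growth_table(response.text)


# Simplified, hypothetical markup used only to exercise the parser offline
SAMPLE_HTML = """
<table><tr><td>Earnings Estimate</td><td>1.19</td></tr></table>
<table>
  <thead><tr><th>Growth Estimates</th><th>AAPL</th></tr></thead>
  <tbody>
    <tr><td>Current Qtr.</td><td>-0.80%</td></tr>
    <tr><td>Next Qtr.</td><td>5.40%</td></tr>
  </tbody>
</table>
"""

print(parse_growth_table(SAMPLE_HTML))
```

Calling `get_growth_data('AAPL')` performs the live request. If it still returns an empty dict, look at `response.status_code` and the start of `response.text` to see what the server actually sent back.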