Tags: python, web-scraping, finance, stock, yahoo-finance

How to scrape/extract specific Balance Sheet fields from Yahoo Finance from a list of stock tickers?


I have a list of tickers in a file named stocktickers.csv (shown below):

Tickers
AMD
AAPL
FB
MSFT
GOOG

I would like to scrape Yahoo Finance balance sheet data for each ticker in that list and write it back into the stocktickers.csv file, as shown below. "Tangible Book Value" and "Share Issued" are fields found on each ticker's Balance Sheet web page, such as AMD's: https://finance.yahoo.com/quote/AMD/balance-sheet?p=AMD.

Tickers  Tangible_BV  Shares_Issued
AMD      1,000,000    500,000
AAPL     2,000,000    200,000
FB       3,000,000    300,000
MSFT     500,000      50,000
GOOG     4,000,000    400,000
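
As a starting point for looping over the list, here is a minimal sketch that reads the Tickers column and builds one balance sheet URL per ticker. The URL pattern is copied from the AMD link above; the helper names read_tickers and balance_sheet_url are made up for illustration, and the sample data stands in for the real file.

```python
import csv
import io

# URL pattern taken from the AMD example link in the question.
URL_TEMPLATE = "https://finance.yahoo.com/quote/{ticker}/balance-sheet?p={ticker}"


def read_tickers(handle):
    """Return the non-empty values of the 'Tickers' column (hypothetical helper)."""
    reader = csv.DictReader(handle)
    return [
        (row.get("Tickers") or "").strip()
        for row in reader
        if (row.get("Tickers") or "").strip()
    ]


def balance_sheet_url(ticker):
    """Build the balance sheet page URL for one ticker (hypothetical helper)."""
    return URL_TEMPLATE.format(ticker=ticker)


# In-memory stand-in for stocktickers.csv so the sketch runs without the file.
sample = io.StringIO("Tickers\nAMD\nAAPL\nFB\nMSFT\nGOOG\n")
tickers = read_tickers(sample)
urls = [balance_sheet_url(t) for t in tickers]

for url in urls:
    print(url)
```

With the real file, the same function should work on a handle opened with open("stocktickers.csv", newline="").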

This is what I have so far, which scrapes the Tangible Book Value for all years.

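from bs4 import BeautifulSoup
import requests


# Send a browser-like User-Agent with the request.
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'
}

url = 'https://finance.yahoo.com/quote/AAPL/balance-sheet?p=AAPL'

response = requests.get(url, headers=header)
response.raise_for_status()  # stop early on an HTTP error

html = response.text
soup = BeautifulSoup(html, "html.parser")
# The data-reactid value is generated by the page, so this lookup is fragile.
main = soup.find("div", {"data-reactid": "195"}) #or 196

if main is None:
    print("Row container not found; the page layout may have changed.")
else:
    # Print the text of the first <span> inside each nested <div>.
    for div in main.find_all("div"):
        span = div.find("span")
        if span is not None:
            print(span.text)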

Result:

   Tangible Book Value
Tangible Book Value
65,339,000
90,488,000
107,147,000
126,032,000
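
To get from this printed output to a single value per ticker, the scraped strings still need to be de-duplicated and converted to numbers. A minimal sketch, assuming the cells arrive in the order shown above (label first, then one value per column); parse_scraped_row is a hypothetical helper name:

```python
def parse_scraped_row(cells):
    """Split scraped cell texts into (label, list of integer values).

    Blank cells and repeated labels are skipped; numeric cells such as
    '65,339,000' are converted to int after removing the thousands commas.
    """
    label = None
    values = []
    for cell in cells:
        text = cell.strip()
        if not text:
            continue
        digits = text.replace(",", "")
        if digits.lstrip("-").isdigit():
            values.append(int(digits))
        elif label is None:
            label = text  # keep only the first occurrence of the label
    return label, values


# The strings printed by the scraper above.
scraped = [
    "   Tangible Book Value",
    "Tangible Book Value",
    "65,339,000",
    "90,488,000",
    "107,147,000",
    "126,032,000",
]
label, values = parse_scraped_row(scraped)
print(label, values[0])
```

Taking values[0] picks the first column as listed on the page, which is assumed here to be the most recent period.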

If there is a way to use get_balance_sheet() (from the yfinance module) to pull specific balance sheet fields such as those above, that would be great too.
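
For the yfinance route, a rough sketch of how it might look. It assumes yfinance's balance sheet DataFrame has line items as row labels and reporting periods as columns with the newest first, and that the row labels are "Tangible Book Value" and "Share Issued"; those labels can differ between yfinance versions, so treat them as assumptions. The function names are hypothetical, and the fetch step is passed in so the CSV logic can be tried without network access.

```python
import csv
import io

# Output column -> balance sheet row label (labels are assumptions, see above).
FIELDS = {
    "Tangible_BV": "Tangible Book Value",
    "Shares_Issued": "Share Issued",
}


def fetch_with_yfinance(ticker):
    """Return {row label: value} for the first balance sheet column."""
    import yfinance as yf  # third-party package, imported only when needed

    sheet = yf.Ticker(ticker).balance_sheet
    if sheet.empty:
        return {}
    # First column is assumed to be the most recent reporting period.
    return sheet.iloc[:, 0].to_dict()


def build_rows(tickers, fetch=fetch_with_yfinance):
    """Build one output row per ticker; missing fields are left blank."""
    rows = []
    for ticker in tickers:
        values = fetch(ticker)
        row = {"Tickers": ticker}
        for column, label in FIELDS.items():
            row[column] = values.get(label, "")
        rows.append(row)
    return rows


def write_csv(rows, handle):
    """Write the rows with a Tickers, Tangible_BV, Shares_Issued header."""
    writer = csv.DictWriter(handle, fieldnames=["Tickers", *FIELDS])
    writer.writeheader()
    writer.writerows(rows)
```

Calling build_rows(["AMD", "AAPL"]) and then write_csv(rows, open("stocktickers.csv", "w", newline="")) would be the intended use; any ticker whose labels are not found ends up with empty cells rather than raising an error.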


Solution

  • Try the Financial Modeling Prep (financialmodelingprep) API. It returns this information as JSON from a simple request, so you don't have to scrape it.