I'm using the Python yfinance Yahoo API to retrieve stock data. Right now I'm getting the PEG ratio, which relates a company's price to its earnings and growth. I have a CSV downloaded from here: https://www.nasdaq.com/market-activity/stocks/screener. It has exactly 8000 stocks.
What I do is get the symbol list and iterate over it to access the Yahoo ticker for each symbol. Then I use the ticker.info attribute, which returns a dictionary. I repeat this for all 8000 symbols. It runs at about 6 symbols per minute, which is not viable. Is there a faster way, with another API or another structure? I don't care which API I use as long as I can get basic info such as growth, earnings, EPS and so on.
Here is the code:
import pandas as pd
import yfinance as yf

data = pd.read_csv("data/stock_list.csv")
symbols = data['Symbol']
for symbol in symbols:
    stock = yf.Ticker(symbol)
    try:
        if stock.info['pegRatio']:
            print(stock.info['shortName'] + " : " + str(stock.info['pegRatio']))
    except KeyError:
        pass
It seems that reading data from the Ticker.info attribute triggers HTTP requests to fetch it, so the loop probably spends most of its time waiting on the network. Multithreading should help with that. Try this:
import pandas as pd
import yfinance as yf
import concurrent.futures

data = pd.read_csv('data/stock_list.csv')

def getPR(symbol):
    # Return (short name, PEG ratio), or (None, None) if the lookup fails.
    sn = None
    pr = None
    try:
        stock = yf.Ticker(symbol)
        info = stock.info  # read the dictionary once and reuse it
        pr = info['pegRatio']
        sn = info['shortName']
    except Exception:
        pass
    return (sn, pr)

# Run the lookups in a thread pool so the HTTP waits overlap.
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = {executor.submit(getPR, sym): sym for sym in data['Symbol']}
    for future in concurrent.futures.as_completed(futures):
        sn, pr = future.result()
        if sn:
            print(f'{sn} : {pr}')
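If you want to keep the results rather than just print them, here is a minimal sketch of the same thread-pool idea that collects everything into a dictionary. The names `collect_peg_ratios`, `fetch_info`, `fake_fetch` and the `max_workers=16` default are my own, not part of yfinance. `fetch_info` is any callable that maps a symbol to an info dictionary, and the stub below stands in for the network call so the example runs offline.

```python
import concurrent.futures


def collect_peg_ratios(symbols, fetch_info, max_workers=16):
    """Look up symbols concurrently; return {symbol: (short_name, peg_ratio)}.

    fetch_info is a hypothetical callable (symbol -> dict) supplied by the caller.
    Symbols whose lookup fails, or that lack a name or PEG ratio, are skipped.
    """
    def worker(symbol):
        try:
            info = fetch_info(symbol)  # one lookup per symbol
        except Exception:
            return symbol, None
        return symbol, (info.get("shortName"), info.get("pegRatio"))

    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as the input symbols
        for symbol, row in executor.map(worker, symbols):
            if row and row[0] and row[1]:
                results[symbol] = row
    return results


def fake_fetch(symbol):
    """Offline stand-in for the real lookup (made-up data, assumption)."""
    data = {
        "AAA": {"shortName": "Alpha Inc.", "pegRatio": 1.5},
        "BBB": {"shortName": "Beta Corp."},  # no pegRatio available
    }
    return data[symbol]  # raises KeyError for unknown symbols


peg = collect_peg_ratios(["AAA", "BBB", "CCC"], fake_fetch)
for symbol, (name, ratio) in peg.items():
    print(f"{name} : {ratio}")
```

With yfinance you would presumably pass something like `lambda s: yf.Ticker(s).info` as `fetch_info`. A higher `max_workers` means more requests in flight at once, so it may be worth starting low in case the server throttles you.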