Big newbie here. I'm using pandas_datareader and yfinance to create an excel workbook with many stocks in it, but recently I have been getting more and more "failed to read symbol" errors, for stocks that definitely still exist. Usually I can just run the code again and it'll work, but the problem seems to slowly be getting worse. I've heard it suggested that maybe yahoo doesn't like to get that many requests at once, and perhaps a time.sleep method could be the solution.
But I'm enough of a beginner that I just don't know how to write that code. Here's my code below; I want the exact same thing just with the delay, I guess between each stock.
import pandas_datareader as web
from datetime import datetime
import pandas as pd
start = datetime(2020,9,1)
end = datetime(2021,12,31)
stock = [ 'ACHC', 'ACIW', [many more stocks listed here cut for simplicity], 'TECH', 'SAM']
df = web.DataReader(stock, 'yahoo', start, end)
df.to_excel(f'excelfilename.xlsx')
Thanks in advance
I also tried to run your code. I don't know why, but it gives me an error, so I installed the yahoo finance library and fixed the code. I created an empty data frame and added each stock to it.
import pandas_datareader.data as web
from datetime import datetime
import pandas as pd
import yfinance as yf
start = datetime(2020,9,1)
end = datetime(2021,12,31)
stocks= ['ACHC','ACIW','TECH','SAM']
dfs = pd.DataFrame()
for stock in stocks:
print(stock)
df = yf.download(stock, start, end)
df['stock'] = stock
dfs = dfs.append(df, ignore_index=True)
dfs.to_excel(f'excelfilename.xlsx')
dfs.head()
Open | High | Low | Close | Adj Close | Volume | stock | |
---|---|---|---|---|---|---|---|
0 | 31.13 | 31.42 | 30.6 | 30.91 | 30.91 | 385700 | ACHC |
1 | 29.73 | 30.99 | 29.73 | 30.57 | 30.57 | 226500 | ACHC |
2 | 30.76 | 31.2 | 30.41 | 31.12 | 31.12 | 379300 | ACHC |
3 | 31.15 | 31.5 | 29.92 | 30.18 | 30.18 | 411200 | ACHC |
4 | 30.7 | 30.82 | 29.2 | 29.81 | 29.81 | 459500 | ACHC |
EDIT: If you want to arrange them horizontally by stock, get the stocks one at a time in a list and change the level of the hierarchical index by deleting unnecessary columns. Then convert the hierarchical index name to a single line and rename the columns. After that, reorder the columns.
df = yf.download(stocks, start, end)
df.drop(['Adj Close',], axis=1, inplace=True)
df = df.swaplevel(1, 0, axis=1)
df.columns = ["_".join(a) for a in df.columns.to_flat_index()]
df.sort_index(axis=1, inplace=True)
Date | ACHC_Close | ACHC_High | ACHC_Low | ACHC_Open | ACHC_Volume |
---|---|---|---|---|---|
2020-08-31 00:00:00 | 30.91 | 31.42 | 30.6 | 31.13 | 385700 |
2020-09-01 00:00:00 | 30.57 | 30.99 | 29.73 | 29.73 | 226500 |
2020-09-02 00:00:00 | 31.12 | 31.2 | 30.41 | 30.76 | 379300 |
2020-09-03 00:00:00 | 30.18 | 31.5 | 29.92 | 31.15 | 411200 |
2020-09-04 00:00:00 | 29.81 | 30.82 | 29.2 | 30.7 | 459500 |