Search code examples
python-3.xpandasfinanceyahoo-financeyfinance

downloading yahoofinance data for date ranges


I have a dataframe with 900 tickers in it, and I'm trying to download adjusted close prices -15 business days before said date through 30 business days after said date, for a total of ~9 weeks of data.

    date        symbol  date_start  date_end
0   2020-03-20  USAU    2020-02-28  2020-05-01
1   2020-03-20  INLX    2020-02-28  2020-05-01
4   2020-03-20  SIML    2020-02-28  2020-05-01
6   2020-03-19  NLNK    2020-02-27  2020-04-30
7   2020-03-19  OMWS    2020-02-27  2020-04-30
9   2020-03-19  LUMO    2020-02-27  2020-04-30
11  2020-03-18  T.TO    2020-02-26  2020-04-29
12  2020-03-18  TU      2020-02-26  2020-04-29
15  2020-03-16  SUMR    2020-02-24  2020-04-27
17  2020-03-16  DTYS    2020-02-24  2020-04-27

I tried to filter on a start date and an end date with -15/+30 threshold for yahoo finance, but I keep getting returned null. I would try the mass download feature, df = yf.download(tickers = [list_of_tickers],...) but it requires explicitly start dates and end dates, and I'm not sure how to get around that. It's returning a pandas dataframe, so I'm not sure if that's the problem explicitly.

I'm aware of this from the documentation, but there's no solution to my problem in there.: How to deal with multi-level column names downloaded with yfinance

Code is below for brainstorming, but I am stuck and don't know how to navigate this problem

data_stocks = []

for symbol in data_shortened['symbol']:
    try: 
        placeholder = yf.download(symbol, data_shortened['start_date'], data_shortened['end_date'])
        data_stocks.append(placeholder)
    
    except:
        print('Failed')
        print(len(data_stocks))

Thank you for any help.


Solution

  • You can use iterrows:

    data_stocks = {}
    
    for _, row in data_shortened.iterrows():
            placeholder = yf.download(row['symbol'], row['date_start'], row['date_end'])
            data_stocks[row['symbol']] = placeholder
    
    stocks = pd.concat(data_stocks)
    

    Output:

    >>> stocks
                      Open   High    Low  Close  Adj Close   Volume
         Date                                                      
    USAU 2020-02-27    7.1    7.5    6.5    6.9        6.9  23460.0
         2020-02-28    6.5    6.8    6.0    6.3        6.3  22010.0
         2020-03-02    6.5    6.6    6.0    6.1        6.1   7300.0
         2020-03-03    6.4    6.5    6.0    6.5        6.5   9700.0
         2020-03-04    6.5    7.0    6.1    7.0        7.0  13340.0
    ...                ...    ...    ...    ...        ...      ...
    DTYS 2020-04-20  120.0  120.0  120.0  120.0      120.0      0.0
         2020-04-21  120.0  120.0  120.0  120.0      120.0      0.0
         2020-04-22  120.0  120.0  120.0  120.0      120.0      0.0
         2020-04-23  120.0  120.0  120.0  120.0      120.0      0.0
         2020-04-24  120.0  120.0  120.0  120.0      120.0      0.0
    
    [358 rows x 6 columns]