Tags: python, dataframe, finance

Why does the filtered date range differ from my start date?


I use this code to get the BTC price, but the returned data starts one day before the start date I selected.

INPUT:

import datetime as dt
import pandas as pd
import pandas_datareader.data as web

tickers = ['BTC-USD']  # Name of asset
tarih = "02-06-2021"
tarih2 = "05-06-2021"
start = dt.datetime.strptime(tarih, '%d-%m-%Y')
end = dt.datetime.strptime(tarih2, '%d-%m-%Y')
returns = pd.DataFrame()
liste = []

for ticker in tickers:
    data = web.DataReader(ticker, 'yahoo', start, end)
    data = pd.DataFrame(data)
    data[ticker] = data['Adj Close']  # could use percentage change instead for more accurate data
    if returns.empty:
        returns = data[[ticker]]
    else:
        returns = returns.join(data[[ticker]], how='outer')  # add as a new column on the right

# daterange() is a helper that is not shown in this post.
# Loop variable renamed from dt so it does not shadow the datetime module alias.
for day in daterange(start, end):
    dates = day.strftime("%d-%m-%Y")
    with open("fng_value.txt", "r") as filestream:
        for line in filestream:
            date = line.split(",")[0]
            if dates == date:
                fng_value = line.split(",")[1]
                liste.append(fng_value)
print(returns.head(25))

OUTPUT:

                 BTC-USD
Date                    
2021-06-01  37575.179688
2021-06-02  39208.765625
2021-06-03  36894.406250
2021-06-04  35551.957031
2021-06-05  35862.378906

Solution

  • DataReader accepts the start parameter as a string, date, or datetime. In some cases, a start date such as 2021-06-02 returns data beginning one day earlier, on 2021-06-01. If the result does not start where you expect, a workaround is to pass a timezone-aware datetime with an hour late in the day.

    See if this works:

    import pandas_datareader.data as web
    import pandas as pd
    from pytz import timezone
    from datetime import datetime, date
    
    tarih = "02-06-2021"
    tarih2 = "05-06-2021"
    
    # start/end can be date, datetime, or string
    
    #start = date(2021, 6, 2)
    #end = date(2021, 6, 5)
    #start = 'JUN-02-2021'
    #end = 'JUN-05-2021'
    
    # Workaround: make both dates timezone-aware (EST) and push the start to 23:00
    start = datetime.strptime(tarih, '%d-%m-%Y').replace(hour=23, tzinfo=timezone('EST'))
    end = datetime.strptime(tarih2, '%d-%m-%Y').replace(tzinfo=timezone('EST'))
    
    tickers=['BTC-USD']  # Name of asset
    
    for ticker in tickers:
        data = web.DataReader(ticker, 'yahoo', start, end)
        data = pd.DataFrame(data)
        print(data)
    

    This returns the data from 6/2 to 6/5.
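
If the timezone workaround feels fragile, another option is to trim the returned frame to the requested range after downloading, so the result no longer depends on how the data source interprets the start date. The snippet below is only a sketch: it builds a small frame by hand from the values in the question's output instead of calling DataReader, and the name trimmed is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question's output (BTC-USD adjusted close).
# In real use, this frame would come from web.DataReader(...).
data = pd.DataFrame(
    {"BTC-USD": [37575.179688, 39208.765625, 36894.406250,
                 35551.957031, 35862.378906]},
    index=pd.to_datetime(
        ["2021-06-01", "2021-06-02", "2021-06-03", "2021-06-04", "2021-06-05"]
    ),
)
data.index.name = "Date"

# Same day-month-year strings as in the question.
start = pd.to_datetime("02-06-2021", format="%d-%m-%Y")
end = pd.to_datetime("05-06-2021", format="%d-%m-%Y")

# Label-based slicing on a sorted DatetimeIndex includes both endpoints,
# so the extra 2021-06-01 row is dropped and 2021-06-05 is kept.
trimmed = data.loc[start:end]
print(trimmed)
```

This keeps the rows for 2021-06-02 through 2021-06-05 whether or not the download included the extra day.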