Search code examples
pythonpandasdatetimeindexingreindex

How to fix what dates your dataframe includes


I have a dataframe whereby I'm trying to get data from today (-5) days until the end of next month.

In the case of today this would be;

ix = pd.DatetimeIndex(start=datetime(2020, 6, 05), end=datetime(2020, 7, 31), freq='D')
df.reindex(ix)

If I wanted to automate this is there any function I can take advantage of?

I've tried

startdate = pd.to_datetime('today') - pd.DateOffset(days=5)
enddate = pd.to_datetime('today', format) + MonthEnd(2)

ix = pd.DatetimeIndex(start=startdate, end=enddate, freq='D')
df.reindex(ix)

..but does now seem to be working. Any help appreciated!


Solution

  • Your code was close, I think the main issue is you are constructing a DatetimeIndex incorrectly: it doesn't take a start or end parameter (see docs). Also agree with @MrFuppes about format being unnecessary.

    I think you want pandas.date_range, which still returns a DatetimeIndex

    startdate = pd.to_datetime('today') - pd.DateOffset(days=5)
    enddate = pd.to_datetime('today') + pd.tseries.offsets.MonthEnd(2)
    
    ix = pd.date_range(startdate,enddate,freq='D')
    

    This seems to work for me, and can be used for reindexing. You could also call the floor method on your start and end dates if you just want to get dates without the specific time of day the code was run:

    ix = pd.date_range(startdate.floor('d'),enddate.floor('d'),freq='D')