Search code examples
pythonpandaspandas-datareader

Pandas DataReader: normalizing dates


I use pandas data reader package to pull economic time series from website like fred, yahoo finance. I have pulled us recession (USREC) series from the 'fred' website and historical sp500 (^GSPC) from yahoo finance.

Historical US recession:

web.DataReader("USREC", "fred", start, end)

Output:

2017-08-01      0
2017-09-01      0
2017-10-01      0
2017-11-01      0

S&P500 returns

web.DataReader("^GSPC",'yahoo',start,end)['Close'].to_frame().resample('M').mean().round()

Output:

2017-08-31  2456.0
2017-09-30  2493.0
2017-10-31  2557.0
2017-11-30  2594.0

I want to merge the two data frames, but one has beginning date of the month and other has ending date of the month. How do I make a) the date column yyyy-mm b) either make the date column of both frames month beginning or month end?

Thanks for the help!


Solution

  • You can use MS for resample by start of months:

    web.DataReader("^GSPC",'yahoo',start,end)['Close'].to_frame().resample('MS').mean().round()
    

    Or is possible use to_period for month PeriodIndex:

    df1 = df1.to_period('M')
    df2 = df2.to_period('M')
    print (df1)
             Close
    2017-08      0
    2017-09      0
    2017-10      0
    2017-11      0
    
    print (df2)
              Close
    2017-08  2456.0
    2017-09  2493.0
    2017-10  2557.0
    2017-11  2594.0
    
    print (df1.index)
    PeriodIndex(['2017-08', '2017-09', '2017-10', '2017-11'], dtype='period[M]', freq='M')
    
    print (df2.index)
    PeriodIndex(['2017-08', '2017-09', '2017-10', '2017-11'], dtype='period[M]', freq='M')