Tags: python, pandas, dataframe, string-to-datetime

Change dataframe column names from string format to datetime


I have a dataframe whose column names are dates (year-month) stored as strings. How can I convert these names to datetime format? I tried this:

new_cols = pd.to_datetime(df.columns)
df = df[new_cols]

but I get the error:

KeyError: "DatetimeIndex(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01',
               '2000-05-01', '2000-06-01', '2000-07-01', '2000-08-01',
               '2000-09-01', '2000-10-01',
               ...
               '2015-11-01', '2015-12-01', '2016-01-01', '2016-02-01',
               '2016-03-01', '2016-04-01', '2016-05-01', '2016-06-01',
               '2016-07-01', '2016-08-01'],
              dtype='datetime64[ns]', length=200, freq=None) not in index"

Thanks!


Solution

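  • Selecting with df[new_cols] does not change the column labels: it looks up the datetime values among labels that are still strings, so you get a KeyError.

    So you need to assign the output to the columns: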

    df.columns = pd.to_datetime(df.columns)
    

    Sample:

    import numpy as np
    import pandas as pd

    cols = ['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01']
    vals = np.arange(5)
    df = pd.DataFrame(columns = cols, data=[vals])
    print (df)
       2000-01-01  2000-02-01  2000-03-01  2000-04-01  2000-05-01
    0           0           1           2           3           4
    
    print (df.columns)
    Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')
    
    df.columns = pd.to_datetime(df.columns)
    
    print (df.columns)
    DatetimeIndex(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01',
                   '2000-05-01'],
                  dtype='datetime64[ns]', freq=None)
    

    It is also possible to convert to periods:

    print (df.columns)
    Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')
    
    df.columns = pd.to_datetime(df.columns).to_period('M')
    
    print (df.columns)
    PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05'],
                 dtype='period[M]', freq='M')
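
The question mentions year-month labels, so the following is a minimal sketch of the same idea with strings such as '2000-01'. The data and the variable names (new_cols, lookup_failed, feb, monthly) are made up for illustration, and passing format="%Y-%m" is an assumption about how the labels are written; it shows that the lookup fails while the labels are still strings and works once the converted labels are assigned.

```python
import pandas as pd

# Hypothetical data: column labels are year-month strings, as described in the question.
df = pd.DataFrame([[10, 20, 30]], columns=["2000-01", "2000-02", "2000-03"])

# An explicit format avoids relying on pandas to guess how "2000-01" should be parsed.
new_cols = pd.to_datetime(df.columns, format="%Y-%m")

# Selecting with df[new_cols] fails: the labels stored in df are still strings.
try:
    df[new_cols]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Assigning to df.columns replaces the labels themselves.
df.columns = new_cols

# After the assignment, a Timestamp label finds its column.
feb = df[pd.Timestamp("2000-02-01")]

# The same labels expressed as monthly periods, without modifying df.
monthly = df.columns.to_period("M")

print(lookup_failed)
print(type(df.columns).__name__)
print(feb.iloc[0])
print(monthly[0])
```

If the labels use a different layout (for example '01/2000'), the format string would need to change accordingly.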