
Remove specific characters from a timestamp


I have a data frame df whose TIMESTAMP column holds timestamps in a rather odd format:

2016-10-25T09:34:52.051713+01:00
2016-10-25T09:46:14.051620+01:00
2016-10-25T09:51:16.052435+01:00
....

I need to use this data as time information. As a first step, I would like to drop the last 13 characters so that the values look like this:

2016-10-25T09:34:52
2016-10-25T09:46:14
2016-10-25T09:51:16
....

To do so, I tried

df['TIMESTAMP'] = df.apply(lambda x: x['TIMESTAMP'][:-13], axis = 1)

which I took from the question "Remove ends of string entries in pandas DataFrame column",

but I receive the error:

TypeError: string indices must be integers

which I do not understand. I am new to Python, but I don't see what I've done wrong.


Solution

  • The strict answer to your question is to use the str accessor, which accepts a slice just like a normal string and applies that slice to every value in the Series:

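    import pandas as pd

    # Sample values taken from the question
    data = ['2016-10-25T09:34:52.051713+01:00',
            '2016-10-25T09:46:14.051620+01:00',
            '2016-10-25T09:51:16.052435+01:00']
    
    s = pd.Series(data)
    
    # Drop the last 13 characters (fractional seconds and UTC offset) from every value
    print(s.str[:-13])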
    

    Output:

    0    2016-10-25T09:34:52
    1    2016-10-25T09:46:14
    2    2016-10-25T09:51:16
    

    However, I think what you actually want is pd.to_datetime, which will (within reason) infer the format of your data and convert it into datetime objects:

    print(pd.to_datetime(s))
    

    Output:

    0   2016-10-25 09:34:52.051713+01:00
    1   2016-10-25 09:46:14.051620+01:00
    2   2016-10-25 09:51:16.052435+01:00
    dtype: datetime64[ns, pytz.FixedOffset(60)]
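
Putting the two approaches together on a data frame like the one in the question might look roughly like the sketch below. The column names TRIMMED and NAIVE are made up for illustration, and it assumes every value carries the same UTC offset, as in the sample data. Dropping the offset with tz_localize(None) keeps the local wall-clock time, which matches the output the question asks for.

```python
import pandas as pd

# Sample values copied from the question; the TRIMMED and NAIVE
# column names are illustrative assumptions, not part of the original.
df = pd.DataFrame({
    "TIMESTAMP": [
        "2016-10-25T09:34:52.051713+01:00",
        "2016-10-25T09:46:14.051620+01:00",
        "2016-10-25T09:51:16.052435+01:00",
    ]
})

# Option 1: stay with strings and slice off the last 13 characters.
df["TRIMMED"] = df["TIMESTAMP"].str[:-13]

# Option 2: parse to timezone-aware datetimes, then drop the UTC offset
# (keeping local wall-clock time) and round down to whole seconds.
parsed = pd.to_datetime(df["TIMESTAMP"])
df["NAIVE"] = parsed.dt.tz_localize(None).dt.floor("s")

print(df[["TRIMMED", "NAIVE"]])
```

Option 1 leaves you with plain strings, while option 2 gives real datetime values that support the .dt accessor, sorting, and time arithmetic.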