Tags: python, pandas, dataframe, timestamp, string-to-datetime

How to concatenate pandas dataframe date and different time formats to single timestamp?


I have two columns in a pandas dataframe, EVENT_DATE and EVENT_TIME. Some of the EVENT_TIME values are in hh.mm.ss format and some are in hh:mm:ss AM/PM format.

When running...

import pandas as pd

df['EVENT_DATE'] = pd.to_datetime(df['EVENT_DATE'], format='%Y%m%d')

print(df['EVENT_DATE'])

...I can get EVENT_DATE in a consumable (for my purposes) format (e.g. 1999-07-28).

But when running...

df['EVENT_TIME'] = pd.to_datetime(df['EVENT_TIME'], format='%H.%M.%S', errors='coerce')
df['EVENT_TIME'] = pd.to_datetime(df['EVENT_TIME'], format='%I:%M:%S %p', errors='coerce')

print(df['EVENT_TIME'])

...the date 1900-01-01 is prepended to the times, and only some of the rows are parsed; the rest come back as NaT.

1900-01-01 16:40:00
1900-01-01 15:55:00
1900-01-01 14:30:00
1900-01-01 13:26:00
NaT
NaT
NaT
NaT
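
The NaT rows most likely come from the second call running on a column that the first call has already overwritten: once the hh.mm.ss pass has replaced the strings with datetimes (and NaT for the rest), the AM/PM strings are no longer there to parse. A minimal sketch of that effect, using made-up sample values because the original data was only shown as an image:

```python
import pandas as pd

# Hypothetical sample values; the real data is not available.
times = pd.Series(["16.40.00", "15.55.00", "02:30:00 PM", "01:26:00 PM"])

# First pass: rows that do not match hh.mm.ss are coerced to NaT.
first = pd.to_datetime(times, format="%H.%M.%S", errors="coerce")

# Second pass on the already-converted values: the AM/PM strings were
# discarded by the first pass, so those rows stay NaT.
second = pd.to_datetime(first, format="%I:%M:%S %p", errors="coerce")

print(second)
```

The 1900-01-01 part is simply the default date that the parser fills in when the format string contains no date fields.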

How do I concatenate the date and the times (which come in multiple formats) into a single timestamp?

Edit1:

@Wen-Ben's solution got me here:

1      19:53:00
11     14:30:00
15     16:30:00

Then to concatenate EVENT_DATE and EVENT_TIME, I found this (which works):

df['TIMESTAMP'] = df.apply(lambda r : pd.datetime.combine(r['EVENT_DATE'], r['EVENT_TIME']),1)

...results in:

1     1999-07-28 19:53:00
11    2001-07-28 14:30:00
15    2002-06-07 16:30:00
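
In recent pandas releases the pd.datetime alias is no longer available, so the line above may fail there. One possible alternative, sketched under the assumption that EVENT_DATE is already a datetime column and EVENT_TIME holds datetime.time objects (which the output above suggests), is to build a combined string and parse it in one vectorized call. The sample rows are illustrative only:

```python
from datetime import time

import pandas as pd

# Hypothetical sample rows mirroring the values shown above.
df = pd.DataFrame({
    "EVENT_DATE": pd.to_datetime(["19990728", "20010728", "20020607"], format="%Y%m%d"),
    "EVENT_TIME": [time(19, 53), time(14, 30), time(16, 30)],  # assumed datetime.time objects
})

# Build "YYYY-MM-DD HH:MM:SS" strings, then parse them without a row-wise apply.
combined = df["EVENT_DATE"].dt.strftime("%Y-%m-%d") + " " + df["EVENT_TIME"].astype(str)
df["TIMESTAMP"] = pd.to_datetime(combined, format="%Y-%m-%d %H:%M:%S")

print(df["TIMESTAMP"])
```

If the row-wise version is preferred, datetime.combine from the standard library's datetime module takes the same two arguments.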

Next I want to get this into ISO 8601 format. I found this (which works):

pd.to_datetime(df['TIMESTAMP']).apply(lambda x: x.strftime('%Y%m%dT%H:%M%SZ'))

...results in:

1      19990728T19:5300Z
11     20010728T14:3000Z
15     20020607T16:3000Z

Here's my new problem:

Running print(df['TIMESTAMP']) still shows the concatenated values (e.g. 1999-07-28 19:53:00) instead of the ISO version (e.g. 19990728T19:5300Z).

How do I get the ISO 8601 values into the dataframe?

Ideally, I want them to take the place of TIMESTAMP. I want this as a transformation of the data, not tacked on as a new column.
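
The strftime call returns a new Series and leaves the dataframe untouched, so the result has to be assigned back to the column. A minimal sketch with illustrative sample values follows. It uses the extended ISO 8601 layout with separators in both the date and the time, which is an assumption about the desired output; the format string in the question mixes a basic date with a partly separated time. The trailing Z also assumes the times are in UTC.

```python
import pandas as pd

# Hypothetical sample rows matching the timestamps shown above.
df = pd.DataFrame(
    {"TIMESTAMP": pd.to_datetime(["1999-07-28 19:53:00", "2001-07-28 14:30:00"])}
)

# Assigning back replaces the column in place; the values become plain strings.
df["TIMESTAMP"] = df["TIMESTAMP"].dt.strftime("%Y-%m-%dT%H:%M:%SZ")

print(df["TIMESTAMP"])
```

After this step the column no longer has a datetime dtype, so any date arithmetic should happen before the conversion.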


Solution

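  • Using fillna: parse the original string column once per format, then fill the rows the first format missed (NaT) with the results of the second.

    # Both calls read the original strings; the column is only overwritten at the end
    s1 = pd.to_datetime(df['EVENT_TIME'], format='%H.%M.%S', errors='coerce')
    s2 = pd.to_datetime(df['EVENT_TIME'], format='%I:%M:%S %p', errors='coerce')
    df['EVENT_TIME'] = s1.fillna(s2)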
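
As a self-contained check of the fillna approach, here is a sketch on made-up sample values (the real data is not available). The final .dt.time step is an assumption about how the time-only values shown in the question's edit were obtained; it is not part of the answer above.

```python
import pandas as pd

# Hypothetical sample values covering both time formats.
df = pd.DataFrame({"EVENT_TIME": ["16.40.00", "02:30:00 PM", "15.55.00", "01:26:00 PM"]})

# Parse the same string column twice, once per format.
s1 = pd.to_datetime(df["EVENT_TIME"], format="%H.%M.%S", errors="coerce")
s2 = pd.to_datetime(df["EVENT_TIME"], format="%I:%M:%S %p", errors="coerce")

# Rows that failed the first format are filled from the second.
parsed = s1.fillna(s2)

# Drop the placeholder 1900-01-01 date, keeping only the time of day.
df["EVENT_TIME"] = parsed.dt.time

print(df["EVENT_TIME"])
```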