I have two columns in a pandas data frame as outlined below. Notice how some of the EVENT_TIME values are in hh.mm.ss format and some are in hh:mm:ss AM/PM format.
When running...
import pandas as pd
df['EVENT_DATE'] = pd.to_datetime(df['EVENT_DATE'], format='%Y%m%d')
print(df['EVENT_DATE'])
...I can get EVENT_DATE in a consumable (for my purposes) format (e.g. 1999-07-28).
But when running...
df['EVENT_TIME'] = pd.to_datetime(df['EVENT_TIME'], format='%H.%M.%S', errors='coerce')
df['EVENT_TIME'] = pd.to_datetime(df['EVENT_TIME'], format='%I:%M:%S %p', errors='coerce')
print(df['EVENT_TIME'])
...1900-01-01 is added to the times, and the conversion is not applied to all rows:
1900-01-01 16:40:00
1900-01-01 15:55:00
1900-01-01 14:30:00
1900-01-01 13:26:00
NaT
NaT
NaT
NaT
How do I concatenate the dates and times (which include multiple time formats) into a single timestamp?
Edit 1:
@Wen-Ben's solution got me here:
1 19:53:00
11 14:30:00
15 16:30:00
Then to concatenate EVENT_DATE and EVENT_TIME, I found this (which works):
df['TIMESTAMP'] = df.apply(lambda r: pd.Timestamp.combine(r['EVENT_DATE'], r['EVENT_TIME']), axis=1)
...results in:
1 1999-07-28 19:53:00
11 2001-07-28 14:30:00
15 2002-06-07 16:30:00
Next I want to get this into ISO8601 format. So I found this (which works):
pd.to_datetime(df['TIMESTAMP']).apply(lambda x: x.strftime('%Y%m%dT%H:%M%SZ'))
...results in:
1 19990728T19:5300Z
11 20010728T14:3000Z
15 20020607T16:3000Z
Here's my new problem:
Running print(TIMESTAMP) still shows the concatenated versions (e.g. 1999-07-28 19:53:00) instead of the ISO version (e.g. 19990728T19:5300Z).
How do I get the ISO8601 column "added" to the dataframe?
Ideally, I want it to take the place of TIMESTAMP. I want it as a transformation of the data, not tacked on as a new column.
Using fillna
s1 = pd.to_datetime(df['EVENT_TIME'], format='%H.%M.%S', errors='coerce')
s2 = pd.to_datetime(df['EVENT_TIME'], format='%I:%M:%S %p', errors='coerce')
df['EVENT_TIME'] = s1.fillna(s2)
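For the follow-up in the edit, here is a minimal end-to-end sketch. The sample rows are made up to mirror the outputs shown in the question, and `event_time`, `event_date`, and `time_of_day` are just illustrative names. It assumes the times are already in UTC (which is what the trailing `Z` claims) and uses the extended ISO 8601 layout `YYYY-MM-DDTHH:MM:SSZ` rather than the format string from the question. `strftime` returns a new Series instead of changing the column in place, so the result has to be assigned back to `df['TIMESTAMP']` to replace it.

```python
import pandas as pd

# Hypothetical sample data mixing both time formats from the question.
df = pd.DataFrame({
    "EVENT_DATE": ["19990728", "20010728", "20020607"],
    "EVENT_TIME": ["19.53.00", "02:30:00 PM", "16.30.00"],
})

# Parse the raw strings once per format; rows that do not match become NaT.
s1 = pd.to_datetime(df["EVENT_TIME"], format="%H.%M.%S", errors="coerce")
s2 = pd.to_datetime(df["EVENT_TIME"], format="%I:%M:%S %p", errors="coerce")

# Fill the gaps left by the first format with the second format's results.
event_time = s1.fillna(s2)

# Parse the date column as in the question.
event_date = pd.to_datetime(df["EVENT_DATE"], format="%Y%m%d")

# Drop the placeholder 1900-01-01 date by keeping only the offset from midnight.
time_of_day = event_time - event_time.dt.normalize()

# Date plus time-of-day gives a real datetime column.
df["TIMESTAMP"] = event_date + time_of_day

# Overwrite the column with ISO 8601 strings (assumes the times are UTC).
df["TIMESTAMP"] = df["TIMESTAMP"].dt.strftime("%Y-%m-%dT%H:%M:%SZ")

print(df["TIMESTAMP"])
```

Both parses read the original strings, rather than the second one reading the output of the first as in the question's attempt. After the last assignment the column holds plain strings, so keep a datetime copy around if you still need date arithmetic later.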