I am looking to do something like in this thread. However, I only want to subtract the time
component of the two datetime
columns.
For eg., given this dataframe:
ts1 ts2
0 2018-07-25 11:14:00 2018-07-27 12:14:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00
The expected output for ts2 -ts1
time component only should give:
ts1 ts2 ts_delta
0 2018-07-25 11:14:00 2018-07-27 12:14:00 1:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 -1:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 -0:17:00
So, for row 0
: the time for ts2
is 12:14:00
, the time for ts1
is 11:14:00
. The expected output is just these two times subtracting (don't care about the days
). In this case:
12:14:00
- 11:14:00
= 1:00:00
.
How would I do this in one single line?
You need to set both datetimes to a common date first.
One way is to use pandas.DateOffset
:
o = pd.DateOffset(day=1, month=1, year=2022) # the exact numbers don't matter
# reset dates
ts1 = df['ts1'].add(o)
ts2 = df['ts2'].add(o)
# subtract
df['ts_delta'] = ts2.sub(ts1)
As one-liner:
df['ts_delta'] = df['ts2'].add((o:=pd.DateOffset(day=1, month=1, year=2022))).sub(df['ts1'].add(o))
Other way using a difference between ts2-ts1 (with dates) and ts2-ts1 (dates only):
df['ts_delta'] = (df['ts2'].sub(df['ts1'])
-df['ts2'].dt.normalize().sub(df['ts1'].dt.normalize())
)
output:
ts1 ts2 ts_delta
0 2018-07-25 11:14:00 2018-07-27 12:14:00 0 days 01:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 -1 days +23:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 -1 days +23:43:00
NB. don't get confused by the -1 days +23:00:00
, this is actually the ways to represent -1hour