I have a data frame with two columns: the event and the date.
           created_at            event
0 2020-11-16 13:41:34  meeting-created
1 2020-11-16 13:49:52    meeting-ended
2 2020-11-16 14:01:36  meeting-created
3 2020-11-16 15:16:24    meeting-ended
I want to calculate the total duration of the meetings, so I need to subtract the first two dates from each other, and then the last two. Note that the dataframe may contain more rows.
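If the events always come in created/ended pairs, I believe you can filter the rows for each event type and subtract them, converting the second Series to a NumPy array so that the subtraction is done by position rather than by index label: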
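import pandas as pd

# Parse the timestamps so they can be subtracted
df['created_at'] = pd.to_datetime(df['created_at'])
# End and start times of each meeting, in row order
s1 = df.loc[df['event'].eq('meeting-ended'), 'created_at']
s2 = df.loc[df['event'].eq('meeting-created'), 'created_at']
# The result keeps the index of the 'meeting-ended' rows; other rows become NaT
df['new'] = s1.sub(s2.to_numpy())
print(df)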
           created_at            event             new
0 2020-11-16 13:41:34  meeting-created             NaT
1 2020-11-16 13:49:52    meeting-ended 0 days 00:08:18
2 2020-11-16 14:01:36  meeting-created             NaT
3 2020-11-16 15:16:24    meeting-ended 0 days 01:14:48
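Since the question asks for the total duration, here is a minimal self-contained sketch that rebuilds the sample data and sums the per-meeting durations. It assumes the rows strictly alternate between meeting-created and meeting-ended; the names `created`, `ended`, `durations` and `total` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "created_at": [
        "2020-11-16 13:41:34",
        "2020-11-16 13:49:52",
        "2020-11-16 14:01:36",
        "2020-11-16 15:16:24",
    ],
    "event": [
        "meeting-created",
        "meeting-ended",
        "meeting-created",
        "meeting-ended",
    ],
})
df["created_at"] = pd.to_datetime(df["created_at"])

created = df.loc[df["event"].eq("meeting-created"), "created_at"]
ended = df.loc[df["event"].eq("meeting-ended"), "created_at"]

# Assumption: every meeting-created row has a matching meeting-ended row,
# so both selections have the same length and line up by position.
durations = ended.sub(created.to_numpy())

# Per-meeting durations, aligned on the index of the meeting-ended rows
df["new"] = durations

# Sum of all meeting durations
total = durations.sum()
print(total)  # 0 days 01:23:06
```

If a meeting has been created but not yet ended, the two selections will differ in length and the subtraction will raise an error, so such rows would need to be dropped or handled first.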