Search code examples
pythonpandasdatedatetimepandas-groupby

Pandas groupby find difference between two date time columns


I have the following dataset,

id      date1                date2              location
1   2019-06-25 19:15:00  2019-06-25 19:15:00       A
1   2019-06-25 20:35:00  2019-06-25 20:36:00       B
1   2019-06-25 22:15:00  2019-06-26 19:00:00       C
2   2019-06-26 21:15:00  2019-06-26 21:41:00       A
2   2019-06-26 23:29:00  2019-06-25 19:15:00       B
2   2019-06-26 23:30:00  2019-06-27 00:37:00       C

I am trying to create a new column that calculates time difference in minutes by doing (date2 - date1), where the date1 is always from the next row (shift(1)).

Expected output,

id      date1                date2              location    difference
1   2019-06-25 19:15:00  2019-06-25 19:15:00       A           NAN
1   2019-06-25 20:35:00  2019-06-25 20:36:00       B           80
1   2019-06-25 22:15:00  2019-06-26 19:00:00       C           99
2   2019-06-26 21:15:00  2019-06-26 21:41:00       A           NAN
2   2019-06-26 23:29:00  2019-06-26 23:29:00       B           108
2   2019-06-26 23:30:00  2019-06-27 00:37:00       C           1

I tried using group, but gives me the wrong output. But without groupby is where i am at,

df['difference'] = ((((df['date1'] - 
                    df['date2'].shift(1)).dt.seconds)/60))

Solution

  • Use:

    diff_shift = lambda x: x['date1'].sub(x['date2'].shift()).dt.total_seconds().div(60)
    df['difference'] = df.groupby('id').apply(diff_shift).droplevel(0)
    print(df)
    
    # Output
       id               date1               date2 location  difference
    0   1 2019-06-25 19:15:00 2019-06-25 19:15:00        A         NaN
    1   1 2019-06-25 20:35:00 2019-06-25 20:36:00        B        80.0
    2   1 2019-06-25 22:15:00 2019-06-26 19:00:00        C        99.0
    3   2 2019-06-26 21:15:00 2019-06-26 21:41:00        A         NaN
    4   2 2019-06-26 23:29:00 2019-06-25 19:15:00        B       108.0
    5   2 2019-06-26 23:30:00 2019-06-27 00:37:00        C      1695.0