Search code examples
pythonpandastimeunix-timestamptimedelta

From unix timestamps to relative date based on a condition from another column in pandas


I have a column of dates in unix timestamps and i need to convert them into relative dates from the starting activity.

The final output should be the column D, which expresses the relative time from the activity which has index = 1, in particular the relative time has always to refer to the first activity (index=1).

A          index  timestamp     D
activity1  1      1.612946e+09  0
activity2  2      1.614255e+09  80 hours
activity3  1      1.612181e+09  0
activity4  2      1.613045e+09  50 hours
activity5  3      1.637668e+09  430 hours

Any idea?


Solution

  • Use to_datetime with unit='s' and then create groups starting by index equal 1 and get first value, last subtract and convert to hours:

    df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
    
    s = df.groupby(df['index'].eq(1).cumsum())['timestamp'].transform('first')
    df['D1'] = df['timestamp'].sub(s).dt.total_seconds().div(3600)
    print (df)
               A  index           timestamp          D           D1
    0  activity1      1 2021-02-10 08:33:20          0     0.000000
    1  activity2      2 2021-02-25 12:10:00   80 hours   363.611111
    2  activity3      1 2021-02-01 12:03:20          0     0.000000
    3  activity4      2 2021-02-11 12:03:20   50 hours   240.000000
    4  activity5      3 2021-11-23 11:46:40  430 hours  7079.722222