Search code examples
rdataframedata.tabledifftime

How to make a timespan column based on a time stamp column?


I have a datatable with timestamps in column 1 and events in column 2. The timestamps have the format Y-m-d H:M:OS3 (e.g. "2019-09-26 07:29:22,778").

I would like to add a new column with time-span values of the difference between timestamp 2 to timestamp 1, then 3 to 1 etc. E.g.:

timestamp                  event           diff in sec
2019-09-26 07:29:22,778    X                   
2019-09-26 07:29:23,918    Y               1.140
2019-09-26 07:29:25,118    Z               2.340
.
.

Solution

  • In base:

    dt1$timediff <- cumsum(c(0, difftime(tail(dt1$timestamp,-1), head(dt1$timestamp,-1))))
    

    or using data.table:

    library(data.table)
    
    dt1[ , timediff := cumsum(c(0, diff(as.numeric(timestamp))))][]
    #>                  timestamp event timediff
    #> 1: 2019-09-26 07:29:22.778     X     0.00
    #> 2: 2019-09-26 07:29:23.917     Y     1.14
    #> 3: 2019-09-26 07:29:25.118     Z     2.34
    

    Another dplyr solution base off of akrun's answer:

    library(dplyr)
    dt1 %>%
      mutate(difftime = difftime(timestamp, timestamp[1], unit = 'sec'))
    

    Data: N.B: I am using data.table to read the data.

    fread(text="date time  event
     2019-09-26 07:29:22.778    X                   
     2019-09-26 07:29:23.918    Y               
     2019-09-26 07:29:25.118    Z") -> dt1
    
    dt1$timestamp <- as.POSIXct(paste(dt1$date, dt1$time), format="%Y-%m-%d %H:%M:%OS")
    
    dt1 <- dt1[,4:3]