r, posixct, difftime

Is there a faster alternative to difftime function in R?


I have a time series dataset with around 120,000 rows, stored as a data frame. Most of the data is at 15-minute intervals, but there is also some monthly data. I want to keep only the 15-minute data and drop the monthly data, so I calculate the difference between consecutive timestamps and then eliminate every row whose difference is not 15 minutes (900 seconds). My timestamp column is named 'DateTime'. I am using the following to calculate the time interval:

n <- nrow(site_data)
site_data[seq_len(n - 1), "Interval"] <- as.numeric(difftime(site_data[2:n, "DateTime"],
                                                             site_data[seq_len(n - 1), "DateTime"],
                                                             units = "secs"))

But this code is taking too long to run. Is there a faster alternative to difftime? The timestamp column is of class POSIXct. Thank you.


Solution

  • Just use diff(as.numeric(timeCol)):

    R> library(microbenchmark)
    R> times <- Sys.time() + 1:1e5
    R> microbenchmark(diff(times), diff(as.numeric(times)))
    Unit: microseconds
                        expr      min      lq    mean  median      uq     max neval cld
                 diff(times) 1653.999 2153.82 8871.00 2407.66 5313.88 41223.4   100   b
     diff(as.numeric(times))  774.058 1215.35 3910.26 1456.82 1846.53 35622.2   100  a 

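    That is not a huge difference, but it is about a factor of two in the mean (roughly 8.9 ms versus 3.9 ms for 1e5 timestamps), and the result is already in seconds.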
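
Applied to the question's data frame, the same idea might look like the sketch below. The sample `site_data` is made up for illustration (four 15-minute readings plus one monthly reading), and `fifteen_min` is a hypothetical name for the filtered result; only the `DateTime` and `Interval` column names come from the question.

```r
# Hypothetical sample data: four 15-minute readings and one monthly reading.
start <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")
site_data <- data.frame(
  DateTime = c(start + 900 * 0:3,
               as.POSIXct("2020-02-01 00:00:00", tz = "UTC"))
)

# POSIXct stores seconds since the epoch, so diff() on the numeric values
# gives the gap to the next row in seconds. The last row has no next row,
# so it gets NA.
site_data$Interval <- c(diff(as.numeric(site_data$DateTime)), NA)

# Keep only rows followed by a 900-second gap; which() drops the NA row.
fifteen_min <- site_data[which(site_data$Interval == 900), ]
```

Under this rule, a 15-minute reading whose next neighbour is a monthly reading is dropped as well, which matches the filtering described in the question. If the timestamps can carry fractional seconds, comparing with a small tolerance instead of `== 900` may be safer.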