Search code examples
rcsvdatetimediffdate-arithmetic

How to subtract datetimes and store them in a separate column?


i am working with csv file and i have a column with name "statistics_lastLocatedTime" as shown in csv file image i would like to subtract second row of "statistics_lastLocatedTime" from first row; third row from second row and so on till the last row and then store all these differences in a separate column and then combine this column to the other related columns as shown in the code given below:

##select related features
data <- read.csv("D:/smart tech/store/2016-10-11.csv")
(columns <- data[with(data, macAddress == "7c:11:be:ce:df:1d" ),
    c(2,10,11,38,39,48,50) ])
write.csv(columns, file = "updated.csv", row.names = FALSE)

## take time difference 
date_data <- read.csv("D:/R/data/updated.csv")
(dates <- date_data[1:40, c(2)])
NROW(dates)
for (i in 1:NROW(dates)) {
  j <- i+1
  r1 <- strptime(paste(dates[i]),"%Y-%m-%d %H:%M:%S")
  r2 <- strptime(paste(dates[j]),"%Y-%m-%d %H:%M:%S")
  diff <- as.numeric(difftime(r1,r2))
  print (diff)
}

## combine time difference with other related columns
combine <- cbind(columns, diff)
combine

now the problem is that i am able to get the difference of rows but not able to store these values as a column and then combine that column with other related columns. please help me. thanks in advance.


Solution

  • This is a four-liner:

    1. Define a custom class 'myDate', and a converter function for your custom datetime, as per Specify custom Date format for colClasses argument in read.table/read.csv
    2. Read in the datetimes as actual datetimes; no need to repeatedly convert later.
    3. Simply use the vectorized diff operator on your date column (it sees their type, and automatically dispatches a diff function for POSIXct Dates). No need for for-loops:

    .

    setClass('myDate') # this is not strictly necessary
    setAs('character','myDate', function(from) { 
      as.POSIXct(from, format='%d-%m-%y %H:%S', tz='UTC') # or whatever timezone
    })
    
    data <- read.csv("D:/smart tech/store/2016-10-11.csv",
                colClasses=c('character','myDate','myDate','numeric','numeric','integer','factor'))
    # ...
    data$date_diff <- c(NA, diff(data$statistics_lastLocatedTime))
    

    Note that diff() produces a result of length one shorter than vector that we diff'ed. Hence we have to pad it (e.g. with a leading NA, or whatever you want).