r date zoo chron

R chron %in% comparison only recognizes every second date


I am using the zoo and chron packages in R to read and transform data. At one point I need to select the part of a chron-indexed zoo object that corresponds to another chron object. Unfortunately, with the %in% operator I only get some of the matching dates. Here is an MWE that reproduces the problem:

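library(chron)
library(zoo)
# One-minute chron sequence from 00:00:00 to 03:10:00 (by is a fraction of a day)
chron1 <- seq(chron("2013-01-01","00:00:00", format=c(dates="y-m-d",times="h:m:s")),
              chron("2013-01-01","03:10:00", format=c(dates="y-m-d",times="h:m:s")),by=1./1440.)
x1 <- runif(200)
z1 <- zoo(x1,chron1)
# Truncate each timestamp to its 10-minute bin and sum the values within each bin
chron10 <- trunc(chron1, "00:10:00")
x10 <- aggregate(z1,chron10,FUN=sum)
# Expected: every index value of the aggregated series is found in chron1
which(index(x10) %in% chron1)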

The (unexpected) output is:

[1]  1  3  5  7  9 10 12 14 16 18 19

Solution

  • chron objects are stored as floating-point numbers, so two values that print as the same datetime can differ slightly depending on how each was calculated. Format them and compare the resulting strings:

    which(format(index(x10)) %in% format(chron1))
    ## [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    

    Comparing truncated values also works, because trunc uses an eps value to ensure that inputs falling slightly short of a whole minute are not truncated down to the previous minute. See ?trunc.times

    which(trunc(index(x10), "minutes") %in% trunc(chron1, "minutes"))
    ##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    

    Also see R FAQ 7.31 ("Why doesn't R think these numbers are equal?").
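
The floating-point effect described in this answer can be reproduced in base R without chron. The sketch below is only an illustration: it treats times as fractions of a day, which is how chron stores them, and the helper `to_minutes` is a hypothetical name, not part of chron or zoo. Rounding to whole minutes is a different technique from the `format` and `trunc` comparisons above, and it assumes every timestamp is meant to fall on a whole minute.

```r
# Decimal fractions are generally not exact in binary floating point,
# so an exact comparison can fail even when the printed values agree.
a <- 0.1 + 0.2
b <- 0.3
exact_match  <- a %in% b                 # exact comparison -> FALSE
approx_match <- isTRUE(all.equal(a, b))  # comparison with a tolerance -> TRUE

# Hypothetical helper (not part of chron or zoo): convert fractional days
# to whole minutes so that values can be compared as integers.
to_minutes <- function(days) as.integer(round(as.numeric(days) * 1440))

# The same 20 instants (00:00 to 03:10 in 10-minute steps), computed two ways.
t_direct <- (0:19) * 10 / 1440                 # each step computed directly
t_summed <- cumsum(c(0, rep(10 / 1440, 19)))   # steps accumulated one by one

# After rounding to whole minutes, every instant is matched.
minute_match <- to_minutes(t_direct) %in% to_minutes(t_summed)

print(exact_match)
print(approx_match)
print(all(minute_match))  # TRUE
```

With the objects from the question, the same idea would read `which(to_minutes(index(x10)) %in% to_minutes(chron1))`, again under the assumption that all timestamps are whole minutes.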