
Bug in the R xts package's to.period function?


I inherited some R code that analyses simulation results. At one point, that code calls the xts package's to.monthly function with indexAt = 'yearmon' to summarize some values in a zoo object.

That code normally runs without issue. Recently, however, when analysing simulations over much older data, the call to to.monthly generated some disturbing Warning messages like this:

Warning in zoo(xx, order.by = index(x), ...) :
  some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique

I culled my data down to the minimum size that still exhibits this Warning. Start with this R code:

library(xts)

z = structure(c(-1062503.35419463, -1080996.55425821, -1099783.92018741, 
-1122831.06978888, -1138804.79976585, -1158620.33101501, -1163717.44859603, 
-1183250.17288897, -1212428.97863421, -1234981.23171341, -1253605.89670471, 
-1269885.84780747, -1272023.98376509, -1284471.17954946, -1313114.61914572, 
-1334861.551294, -1349971.87378146, -1360596.77251109, -1363047.71977556, 
-1383840.30131117, -1407963.97518998, -1427010.7195352, -1451908.36211767, 
-1464563.94519573, -1470017.67402451, -1503642.02732151, -1529231.67395429, 
-1560593.79655716, -1582052.24505653, -1595391.99583389), index = structure(c(1111985820, 
1112072340, 1112158740, 1112245140, 1112331540, 1112392740, 1112587140, 
1112673540, 1112759880, 1112846340, 1112932200, 1112993940, 1113191940, 
1113278340, 1113364560, 1113451080, 1113537540, 1113598740, 1113796560, 
1113883140, 1113969540, 1114055940, 1114142220, 1114203540, 1114401480, 
1114487940, 1114574280, 1114660740, 1114747080, 1114808340), class = c("POSIXct", 
"POSIXt")), class = "zoo")

class(z)
head(z)
tail(z)

Then execute this call to to.monthly:

to.monthly(z, indexAt = 'yearmon', name = "Monthly")

On my machine that generates this output:

Warning in zoo(xx, order.by = index(x), ...) :
  some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
Warning in zoo(xx, order.by = index(x), ...) :
  some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique
         Monthly.Open Monthly.High Monthly.Low Monthly.Close
Apr 2005     -1062503     -1062503    -1138805      -1138805
Apr 2005     -1158620     -1158620    -1595392      -1595392

Note the warning messages, followed by the result of to.monthly: a zoo object whose index contains "Apr 2005" twice.

I spent some time executing the code in to.monthly line by line, and determined that the bug actually happens inside to.monthly's call to to.period.

In particular, I found that the xx local variable inside to.period is initially calculated correctly, but after the line

indexClass(xx) <- indexAt

is executed, the index entries of xx are no longer unique.

That behavior sure looks like a bug in the xts package's to.period function to me.

I would love to hear from someone who knows how to.monthly/to.period/yearmon really work. Please either confirm that this is a bug, or explain why it is not and suggest a workaround.

I found this possibly related report on the xts GitHub page (which I do not fully understand).

Concerning my machine:

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

...   

other attached packages:
...
xts_0.10-0
zoo_1.8-0

When I start up Rgui, I see this warning message about xts:

Warning: package ‘xts’ was built under R version 3.4.2

Solution

  • This looks like a bug, unrelated to #158. The problem is that the index of z is POSIXct in your local timezone. You aggregate to monthly, which doesn't have a timezone (so xts sets the timezone attribute to "UTC").

    But the change in timezone occurs on the POSIXct index, which changes the local time before the index is converted to "yearmon". So, depending on your local timezone's offset from UTC, this may convert the first (last) observation in a month into the last (first) observation of the prior (next) month.

    To illustrate:

    Sys.setenv(TZ = "America/Chicago")
    debugonce(xts:::`indexClass<-.xts`)
    to.monthly(z, indexAt="yearmon", name="monthly")
    # <snip>
    # Browse[2]> 
    # debug: attr(attr(x, "index"), "tzone") <- "UTC"
    # Browse[2]> print(x)  # When timezone is "America/Chicago"
    #                     monthly.Open monthly.High monthly.Low monthly.Close
    # 2005-03-31 22:59:00     -1062503     -1062503    -1138805      -1138805
    # 2005-04-29 15:59:00     -1158620     -1158620    -1595392      -1595392
    # Browse[2]> 
    # debug: attr(attr(x, "index"), "tclass") <- value
    # Browse[2]> print(x)  # When timezone is "UTC"
    #                     monthly.Open monthly.High monthly.Low monthly.Close
    # 2005-04-01 04:59:00     -1062503     -1062503    -1138805      -1138805
    # 2005-04-29 20:59:00     -1158620     -1158620    -1595392      -1595392
    # Warning message:
    # timezone of object (UTC) is different than current timezone ().
    

    You can see that the call to attr(attr(x, "index"), "tzone") <- "UTC" pushed the last observation in March into the first day of April (note that the debugger lists the next call it will evaluate above my calls to print(x)).

    Thanks for narrowing it down to the indexClass<- call. That made it a lot easier for me to debug!
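
The month shift described above can be reproduced with base R alone, without stepping through xts. The following is a minimal sketch, not the code path xts actually runs. It takes two timestamps from the index of z (the ones shown in the debugger output) and formats the same instants in "America/Chicago" and in "UTC". The variable names are illustrative only.

```r
# Two timestamps from the index of z (seconds since the epoch):
# the last March observation in Chicago time, and the last observation overall.
secs <- c(1112331540, 1114808340)
idx <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Show the same instants as wall-clock times in each time zone.
print(data.frame(
  chicago = format(idx, tz = "America/Chicago", usetz = TRUE),
  utc     = format(idx, tz = "UTC", usetz = TRUE)
))

# Month labels per time zone; "%Y-%m" avoids locale-dependent month names.
month_chicago <- format(idx, "%Y-%m", tz = "America/Chicago")
month_utc     <- format(idx, "%Y-%m", tz = "UTC")

# The labels are distinct in Chicago time but collide once viewed in UTC,
# which mirrors the duplicated "Apr 2005" rows in the question.
print(month_chicago)
print(month_utc)
print(anyDuplicated(month_utc) > 0)
```

If the month labels differ between the two time zones for a period endpoint, that endpoint is one that the timezone change can move across a month boundary.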