Create a new zoo object based on a vector of variable names


Set the seed and variables. Please assume everything in this section is given and unalterable.

library(zoo)
set.seed(123)
a <- zoo(rnorm(10), order.by = as.Date(50:60))
b <- zoo(rnorm(10), order.by = as.Date(50:60))
c <- zoo(rnorm(10), order.by = as.Date(50:60))
lags <- c(1,3,1)
variables <- c("a","c","b")

I want to create a zoo object that picks the variables named in "variables", in that order, and applies the corresponding lags from "lags". This is my desired output (including column names):

                   a.l1       c.l3        b.l1
20/02/1970           NA         NA          NA
21/02/1970  -0.56047565         NA   1.2240818
22/02/1970  -0.23017749         NA   0.3598138
23/02/1970   1.55870831 -1.0678237   0.4007715
24/02/1970   0.07050839 -0.2179749   0.1106827
25/02/1970   0.12928774 -1.0260044  -0.5558411
26/02/1970   1.71506499 -0.7288912   1.7869131
27/02/1970   0.46091621 -0.6250393   0.4978505
28/02/1970  -1.26506123 -1.6866933  -1.9666172
01/03/1970  -0.68685285   0.837787   0.7013559
02/03/1970  -0.44566197  0.1533731  -0.4727914

This is the closest I could get, but it doesn't work. I think the problem is somewhere in the "get" function.

lag(as.zoo(mget(variables)),lags-1)

Many thanks


Solution

  • mget(variables) returns a list with one element per name in variables, each holding the zoo series stored under that name.

    You can get it into a structure usable by lag() by binding the elements of the list into columns using do.call("cbind", mget(variables)). As far as I know, it's actually unnecessary to wrap this in as.zoo().

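    To get the proper lags, you need -lags rather than lags-1: in lag(), a negative k moves earlier values forward to later dates, so a lag of k periods is written as -k.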

    Putting this together, you get:

    lagged <- lag(do.call("cbind", mget(variables)), -lags)
    

    lag() applies every lag in -lags to every column, so the result contains both the 1- and 3-period lags of each variable, and you'll have to do a bit of post-processing to get the format you want. The following should do it:

    lagged <- lagged[, c("a.lag-1", "c.lag-3", "b.lag-1")]
    colnames(lagged) <- c("a.l1", "c.l3", "b.l1")
    

    Note though that since all lags are NA at 1970-02-20, this row is excluded from the output; lag() on zoo objects drops such leading rows unless you pass na.pad = TRUE.
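
As an alternative to lagging everything and then selecting columns, here is a minimal sketch that lags each series by its own lag and merges the results. It assumes the setup from the question; the names idx, series, lagged_list and result are only illustrative, and the explicit origin in as.Date() is added so the setup also runs on older R versions. Treat it as one possible approach rather than the definitive one.

```r
library(zoo)

# Setup from the question (origin spelled out for older R versions)
set.seed(123)
idx <- as.Date(50:60, origin = "1970-01-01")
a <- zoo(rnorm(10), order.by = idx)
b <- zoo(rnorm(10), order.by = idx)
c <- zoo(rnorm(10), order.by = idx)
lags <- c(1, 3, 1)
variables <- c("a", "c", "b")

# Fetch the series in the requested order as a named list
series <- mget(variables)

# Lag each series by its own lag; na.pad = TRUE keeps the leading NA rows
lagged_list <- Map(function(x, k) lag(x, -k, na.pad = TRUE), series, lags)

# Merge into one zoo object and name the columns "<variable>.l<lag>"
result <- do.call(merge.zoo, lagged_list)
colnames(result) <- paste0(variables, ".l", lags)
```

Because each series is paired with exactly one lag, no unwanted columns are created, and with na.pad = TRUE the 1970-02-20 row is kept with NA in every column, as in the desired output.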