Apply rolling function which generates a list of data.frames (or single rbinded data.frame)


The two most commonly used functions for rolling computations (that I'm aware of) are zoo::rollapply() and data.table::frollapply().

However, neither seems capable of running a function which generates a data.frame for each step and then returning them either in a list or as a single rbind-ed data.frame.

As a trivial example, if I have a function which simply returns a trivial data.frame and I call it with a rolling function, I'd expect to get:

f <- function(x) {
  data.frame(a = LETTERS[x], b = x)
}

# will be called twice, with inputs 1:2 and 2:3.
myrollapply(1:3, n = 2, FUN = f)
#> [[1]]
#>   a b
#> 1 A 1
#> 2 B 2
#>
#> [[2]]
#>   a b
#> 1 B 2
#> 2 C 3
#>
#> -- OR --
#> 
#>   a b
#> 1 A 1
#> 2 B 2
#> 3 B 2
#> 4 C 3

(For those who are interested: my real use case rolls through a vector (or list, if necessary) of dates and calls an external API which returns tabular data. Due to a restriction of the API, I can't fetch everything at once and must make multiple calls to get what I need. My end goal is to collate all the data.frames I get from the API into a single rbind-ed data.frame.)

This seems impossible with zoo and data.table:

zoo does not seem to allow lists as zoo objects, which rules them out as input or output. And if the function returns a bare data.frame, the result appears to be an rbind of transposed versions of the individual data.frames (it is also a matrix rather than a data.frame, which isn't acceptable).

zoo::rollapply(c(1, 2, 3),
               width = 2,
               FUN = function(x) {data.frame(a = 1:3)})
#>      a1 a2 a3
#> [1,]  1  2  3
#> [2,]  1  2  3

zoo::rollapply(data.frame(a = c(1, 2, 3)),
               width = 2,
               FUN = function(x) {data.frame(a = 1:3)})
#>      a
#> [1,] 1
#> [2,] 1
#> [3,] 2
#> [4,] 2
#> [5,] 3
#> [6,] 3

zoo::rollapply(c(1, 2, 3),
               width = 2,
               FUN = function(x) {list(data.frame(a = 1:3))})
#> Error in zoo(do.call("c", dat), index(data)[ix], attr(data, "frequency")) : 
#>   “x” : attempt to define invalid zoo object

zoo::rollapply(list(1, 2, 3),
               width = 2,
               FUN = function(x) {data.frame(a = 1:3)})
#> Error in zoo(data) : “x” : attempt to define invalid zoo object

With data.table::frollapply, the problem is easier to understand: the return value must be numeric (or castable to numeric).

data.table::frollapply(c(1, 2, 3), n = 2, FUN = function(x) {data.frame(a = 1:3)})
#> Error in data.table::frollapply(c(1, 2, 3), n = 2, FUN = function(x) { : 
#>   frollapply: results from provided FUN are not of type double

Is there a package or method which can handle this particular case? I'm currently doing it by hand with a for-loop, but suspect there may be a better, more R-like solution.


Solution

    1) Run rollapply on the indexes and then use apply.

    library(zoo)
    
    f <- function(x) data.frame(a = LETTERS[x], b = x)
    ii <- rollapply(1:3, 2, c)
    apply(ii, 1, f)
    

    giving:

    [[1]]
      a b
    1 A 1
    2 B 2
    
    [[2]]
      a b
    1 B 2
    2 C 3
    

    2) This also works:

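    L <- list()
    # FUN stores each window's data.frame in L via <<-; rollapply's own result (junk) is ignored.
    # i = x[1] works as the list index only because the data here are 1, 2, 3.
    junk <- rollapply(1:3, 2, function(x, i = x[1]) L[[i]] <<- f(x))
    L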
    

    giving:

    [[1]]
      a b
    1 A 1
    2 B 2
    
    [[2]]
      a b
    1 B 2
    2 C 3
    

    3) Another approach is to compute on the language.

    # build one call string per window, e.g. "f(c(1, 2))", then parse and evaluate each
    s <- rollapply(1:3, 2, function(x) sprintf("f(c(%s))", toString(x)))
    lapply(s, function(x) eval(parse(text = x)))
    

    giving:

    [[1]]
      a b
    1 A 1
    2 B 2
    
    [[2]]
      a b
    1 B 2
    2 C 3
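
The question's end goal was a single rbind-ed data.frame, and none of the approaches above needs zoo for anything beyond generating the windows. As a minimal sketch in base R only: `roll_list` below is a hypothetical helper written for this illustration, not a function from zoo or data.table. It loops over the window start positions with `lapply` and then combines the pieces with `do.call(rbind, ...)`.

```r
# Hypothetical helper (not part of zoo or data.table): apply FUN to each
# rolling window of length n over x and return the results as a list.
roll_list <- function(x, n, FUN, ...) {
  stopifnot(n >= 1, n <= length(x))
  starts <- seq_len(length(x) - n + 1)            # first position of each window
  lapply(starts, function(i) FUN(x[i:(i + n - 1)], ...))
}

# Same example function as in the question and answer.
f <- function(x) data.frame(a = LETTERS[x], b = x)

L <- roll_list(1:3, n = 2, FUN = f)   # list with one data.frame per window
combined <- do.call(rbind, L)         # all windows stacked into one data.frame
```

Because the helper subsets with `x[...]`, it should also accept a list or a Date vector as input. The same `do.call(rbind, ...)` step should work on the lists produced by 1) to 3) as well.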