How to feed in the arguments of a function used in rollapplyr FUN


Let's say I have the following data:

input <- data.frame(id=rep(c('A', 'B'), c(10, 10)),year=rep(1:10, 2),
                  y=c(rnorm(10), rnorm(10)),x1 = c(rnorm(10),rnorm(10)))

I want to use rollapplyr to do a rolling regression. First I define a beta function:

# Simple Regression
beta <- function(x, indepvar, depvar) {
  a <- coef(lm(formula = indepvar ~ depvar, data = x))
  return(a)
}

Now I want to use this function in a rollapplyr setting. (I know that I can define the function within the rollapplyr, but I want to understand the concept.)

rollapplyr(input, width = 6,
              FUN = beta, x = input, indepvar = y, depvar = x1,
              by.column = FALSE)

I am trying to feed the arguments of the beta function by defining input, indepvar and depvar in the code above. But I am getting this error:

Error in FUN(data[posns, ], ...) : unused argument (data[posns, ])

There is a related question here (unused arguments), but I don't understand which argument I am not using. What does this error mean in my context? Why am I getting it?


Solution

  • I think there are a few issues here. Let me walk you through the most critical ones:

    # Simple Regression
    beta <- function(x, indepvar, depvar) {
      a <- coef(lm(formula = indepvar ~ depvar, data = x))
      return(a)
    }
    

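    The way you have written your function beta means that you have to pass in the data x, the indepvar column and the depvar column. But this does not work with lm: the formula indepvar ~ depvar makes lm look for variables literally called indepvar and depvar. They are not columns of x, so R falls back to the function arguments and tries to evaluate whatever you passed in (here the bare symbols y and x1) in the calling environment. For instance, the following would not work: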

    beta(input, y, x1)
    

    Error in eval(expr, envir, enclos) : object 'y' not found

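    This is because y and x1 do not exist outside of input. Your rollapplyr call has the same issue. On top of that, the error you actually saw has a separate cause: rollapplyr already passes each window to FUN as the first unnamed argument (that is the FUN(data[posns, ], ...) in the message). Since you also supply x, indepvar and depvar by name, all three parameters of beta are taken and there is nowhere left for the window to go, hence "unused argument (data[posns, ])". One way to get around this is to write: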

    beta <- function(indepvar, depvar) {
      a <- coef(lm(indepvar ~ depvar))
      return(a)
    }
    

    And pass the columns in explicitly:

    # > beta(input[,3],input[,4])
    # (Intercept)      depvar 
    #   0.1308993   0.2373399
    

    Now this works:

    library(zoo)  # provides rollapplyr
    rollapplyr(input[3:4], width = 6,
               FUN = function(x) beta(x[,1], x[,2]),
               by.column = FALSE)
    
    #      (Intercept)        depvar
    # [1,] -0.04987909  0.6433585022
    # [2,] -0.23739671  0.7527017129
    # [3,] -0.40483456  0.5833452315
    # [4,] -0.28191172  0.6660916836
    # [5,]  0.02886934  0.5334114615
    # [6,]  0.17284232  0.8126499211
    # [7,]  0.01236415  0.3194661428
    # [8,]  0.48156300 -0.1532216150
    # [9,]  0.75965765 -0.1993015431
    # [10,]  0.80509109 -0.1822009137
    # [11,]  0.55055694 -0.0005880675
    # [12,]  0.53963291 -0.0262970723
    # [13,]  0.46509011  0.0570725348
    # [14,]  0.33227459  0.1598345855
    # [15,] -0.20316429  0.2757045612
    

    If you want to be able to call the columns by name, you can write your beta function as:

    library(zoo)
    beta <- function(x, indepvar, depvar) {
      a <- coef(lm(as.formula(paste(indepvar, "~", depvar)), 
                   data = x))
      return(a)
    }
    
    rollapplyr(input[3:4], width = 6,
               FUN = function(x) beta(as.data.frame(x), "y", "x1"),
               by.column = FALSE)
    
    #      (Intercept)            x1
    # [1,] -0.04987909  0.6433585022
    # [2,] -0.23739671  0.7527017129
    # [3,] -0.40483456  0.5833452315
    # [4,] -0.28191172  0.6660916836
    # [5,]  0.02886934  0.5334114615
    # [6,]  0.17284232  0.8126499211
    # [7,]  0.01236415  0.3194661428
    # [8,]  0.48156300 -0.1532216150
    # [9,]  0.75965765 -0.1993015431
    # [10,]  0.80509109 -0.1822009137
    # [11,]  0.55055694 -0.0005880675
    # [12,]  0.53963291 -0.0262970723
    # [13,]  0.46509011  0.0570725348
    # [14,]  0.33227459  0.1598345855
    # [15,] -0.20316429  0.2757045612
    

    Notice I have to supply input[3:4] instead of just input to rollapplyr, because apparently rollapplyr only takes a matrix as input. If input has mixed types, rollapplyr coerces it to a character matrix, which is not desirable. So I have to both supply only the numeric columns and coerce each window back to a data.frame with as.data.frame for lm to work.
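
To see why the id column is a problem, here is a minimal sketch of the coercion using only base R. It assumes that what rollapplyr does internally behaves like as.matrix(); the set.seed call and the names m_all and m_num are mine, added only for illustration.

```r
# Same structure as the data in the question; set.seed is only for reproducibility
set.seed(42)
input <- data.frame(id = rep(c("A", "B"), c(10, 10)), year = rep(1:10, 2),
                    y = rnorm(20), x1 = rnorm(20))

# With the id column included, every cell is forced to character
m_all <- as.matrix(input)
is.character(m_all)

# With only the numeric columns, the matrix stays numeric
m_num <- as.matrix(input[3:4])
is.numeric(m_num)
```

A character matrix would make lm treat y and x1 as text, which is why only the numeric columns are passed in.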

    Here are two links that discuss this issue with rollapplyr:

    Is there a function like rollapply for data.frame

    Can `ddply` (or similar) do a sliding window?
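
Coming back to the title question: if you would rather not wrap beta in an anonymous function, extra named arguments given to rollapplyr are forwarded to FUN through `...`. The sketch below is one possible arrangement, not the only one. It assumes beta takes the window as its first parameter and converts it with as.data.frame itself; set.seed, res and err are names I added for illustration. The parameter names follow the question, so indepvar ends up on the left-hand side of the formula.

```r
library(zoo)

# Same structure as the data in the question; set.seed is only for reproducibility
set.seed(42)
input <- data.frame(id = rep(c("A", "B"), c(10, 10)), year = rep(1:10, 2),
                    y = rnorm(20), x1 = rnorm(20))

# x receives each window; it arrives as a matrix, so convert it back for lm()
beta <- function(x, indepvar, depvar) {
  coef(lm(as.formula(paste(indepvar, "~", depvar)), data = as.data.frame(x)))
}

# indepvar and depvar are forwarded to beta; the window fills x positionally
res <- rollapplyr(input[3:4], width = 6, FUN = beta,
                  indepvar = "y", depvar = "x1", by.column = FALSE)
head(res, 3)

# Naming x as well leaves no parameter for the window, which reproduces
# the "unused argument" error from the question
err <- tryCatch(
  rollapplyr(input[3:4], width = 6, FUN = beta, x = input,
             indepvar = "y", depvar = "x1", by.column = FALSE),
  error = function(e) conditionMessage(e))
err
```

The only difference between the working call and the failing one is x = input, so the fix for the original error is simply to leave the first parameter for rollapplyr to fill.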