
Using rollapply and lm over multiple columns of data


I have a data frame similar to the following with a total of 500 columns:

Probes <- data.frame(Days=seq(0.01, 4.91, 0.01), B1=5:495,B2=-100:390, B3=10:500,B4=-200:290)

I would like to calculate a rolling-window linear regression where the window size is 12 data points and consecutive windows start 6 data points apart. For each regression, "Days" will always be the x component of the model, and the y's will be each of the other columns in turn (B1, followed by B2, B3, etc.). I would then like to save the coefficients as a data frame with the existing column titles (B1, B2, etc.).

I think my code is close, but it is not quite working. I used rollapply from the zoo library.

slopedata<-rollapply(zoo(Probes), width=12, function(Probes) { 
 coef(lm(formula=y~Probes$Days, data = Probes))[2]
 }, by = 6, by.column=TRUE, align="right")

If possible, I would also like to have the "xmins" saved to a vector to add to the data frame. By xmin I mean the smallest x value used in each regression (basically every 6th value in the "Days" column). Thanks for your help.


Solution

  • 1) Define a zoo object z whose data contains Probes and whose index is taken from the first column of Probes, i.e. Days. Since lm allows y to be a matrix, define a coefs function that computes the regression coefficients for all response columns in one call. Finally, rollapply over z. Because align = "left" is used, the index of the returned object is the first Days value in each window, i.e. xmin.

    library(zoo)
    
    z <- zoo(Probes, Probes[[1]])
    
    coefs <- function(z) c(unlist(as.data.frame(coef(lm(z[,-1] ~ z[,1])))))
    rz <- rollapply(z, 12, by = 6, coefs, by.column = FALSE, align = "left")
    

    giving:

    > head(rz)
         B11 B12  B21 B22 B31 B32  B41 B42
    0.01   4 100 -101 100   9 100 -201 100
    0.07   4 100 -101 100   9 100 -201 100
    0.13   4 100 -101 100   9 100 -201 100
    0.19   4 100 -101 100   9 100 -201 100
    0.25   4 100 -101 100   9 100 -201 100
    0.31   4 100 -101 100   9 100 -201 100
    

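    In this output each response column contributes two columns: the suffix 1 is the intercept and the suffix 2 is the slope, so B11 and B12 are the intercept and slope for B1.

    Note that DF <- fortify.zoo(rz) could be used if you needed a data frame representation of rz.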
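The question asked only for the slopes, under the original column names, plus xmin. Below is a minimal sketch of one way to adapt approach 1) for that, assuming the same Probes data as in the question. The names slopes, rs and slope_df are hypothetical and not part of the original answer.

```r
library(zoo)

# Example data from the question
Probes <- data.frame(Days = seq(0.01, 4.91, 0.01),
                     B1 = 5:495, B2 = -100:390, B3 = 10:500, B4 = -200:290)

z <- zoo(Probes, Probes[[1]])

# Hypothetical helper: keep only row 2 of the coefficient matrix (the slopes).
# Its names are the response column names B1, B2, ...
slopes <- function(w) coef(lm(w[, -1] ~ w[, 1]))[2, ]

rs <- rollapply(z, 12, by = 6, slopes, by.column = FALSE, align = "left")

# With align = "left" the index of rs is the first Days value of each window
slope_df <- data.frame(xmin = time(rs), coredata(rs))
head(slope_df)
```

Because every B column in the example is an exact linear function of Days, all slopes should come out at (approximately) 100.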

    2) An alternative, somewhat similar approach would be to rollapply over the row numbers:

    library(zoo)
    y <- as.matrix(Probes[-1])
    Days <- Probes$Days
    n <- nrow(Probes)
    coefs <- function(ix) c(unlist(as.data.frame(coef(lm(y ~ Days, subset = ix)))), 
          xmins = Days[ix][1])
    r <- rollapply(1:n, 12, by = 6, coefs)
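As a cross-check that does not depend on zoo, the same windows can be generated in base R with seq and sapply. This is only a sketch under the assumptions of the question (width 12, step 6, left-aligned windows). The names width, step, starts, slope_at and check_df are hypothetical.

```r
# Example data from the question
Probes <- data.frame(Days = seq(0.01, 4.91, 0.01),
                     B1 = 5:495, B2 = -100:390, B3 = 10:500, B4 = -200:290)

y <- as.matrix(Probes[-1])
Days <- Probes$Days
n <- nrow(Probes)

width <- 12  # points per regression
step <- 6    # distance between consecutive window starts

# First row number of every complete window
starts <- seq(1, n - width + 1, by = step)

# Hypothetical helper: xmin and per-column slopes for the window starting at row s
slope_at <- function(s) {
  ix <- s:(s + width - 1)
  c(xmin = Days[s], coef(lm(y[ix, , drop = FALSE] ~ Days[ix]))[2, ])
}

# sapply returns one column per window, so transpose to one row per window
check_df <- as.data.frame(t(sapply(starts, slope_at)))
head(check_df)
```

The xmin column here should match the xmins column produced by approach 2), and the B1, B2, ... columns should match the slope (suffix 2) columns.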