
Linear regression - moving independent and dependent columns for each run


I am struggling with solving this problem:

Let's say I have the mtcars data frame (data("mtcars")). I would like to regress one column on another (e.g. mtcars$mpg versus mtcars$wt), store the R-squared and the residuals of each regression in two separate data frames, then shift both the independent and dependent variables one column to the right (e.g. mtcars$cyl versus mtcars$qsec). I need to repeat this until I reach the last independent variable column (in mtcars this would be the drat column).

Any help would be appreciated!


Solution

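  • Use rollapply over a column index. list(c(-5, 0)) means use offsets -5 and 0 on each iteration, so at index i the function receives the column pair c(i - 5, i). Given a two-column data frame, lm treats the first column as the response. Only positions where both offsets are valid are evaluated, so i runs from 6 to 11.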

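    library(zoo)
    
    # Residuals: after t(), one column per regression and one row per car
    resids <- t(rollapply(1:ncol(mtcars), list(c(-5, 0)), 
      function(ix) resid(lm(mtcars[, ix]))))
    
    # R-squared of each regression, in the same order
    rsquareds <- rollapply(1:ncol(mtcars), list(c(-5, 0)), 
      function(ix) summary(lm(mtcars[, ix]))$r.squared)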
    

    If you meant to swap the dependent and independent variables, use list(c(0, -5)) instead.
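
For comparison, the same column pairing can be sketched in base R without zoo. This is only an illustrative alternative, not the method from the answer: the names offset, dep_ix, ind_ix, pair_names, resids_df and rsq_df are made up for this sketch, and the formulas are built with reformulate so that the response and predictor of each fit are explicit.

```r
# Base-R sketch: pair column i with column i + 5 and fit response ~ predictor.
# All object names below are hypothetical, chosen for this example only.
data("mtcars")

offset <- 5                                  # distance between paired columns
dep_ix <- seq_len(ncol(mtcars) - offset)     # 1..6: mpg, cyl, disp, hp, drat, wt
ind_ix <- dep_ix + offset                    # 6..11: wt, qsec, vs, am, gear, carb

# Labels such as "mpg_vs_wt" identify each regression
pair_names <- paste(names(mtcars)[dep_ix], names(mtcars)[ind_ix], sep = "_vs_")

# One lm fit per pair, e.g. mpg ~ wt, cyl ~ qsec, ...
fits <- Map(function(d, i) {
  lm(reformulate(names(mtcars)[i], response = names(mtcars)[d]), data = mtcars)
}, dep_ix, ind_ix)
names(fits) <- pair_names

# Residuals: one column per regression, one row per car
resids_df <- as.data.frame(lapply(fits, resid))

# R-squared: one row per regression
rsq_df <- data.frame(
  pair = pair_names,
  r.squared = unname(vapply(fits, function(f) summary(f)$r.squared, numeric(1)))
)
```

Like the rollapply version, this fits six regressions, ending with wt versus carb. To stop at drat versus gear, as the question describes, dep_ix could be limited to 1:5. Swapping the roles of d and i in the reformulate call reverses the dependent and independent variables.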