Create dataframe with all coefficients from rolling regression


I want to create two dataframes that store all intercept values and all beta values, respectively.

I want each column of the first dataframe to be labelled "Y1 Intercept", "Y2 Intercept", ..., "Y(n) Intercept", and similarly for the second dataframe: "Y1 Coef 1", "Y2 Coef 1", and so on. The column names matter for keeping track of the results, because in practice I will run this code on a much larger dataframe with multiple X coefficients.

Ideally I would also create a third dataframe to store all R-squared values, but hopefully I can figure this point out after help with the above.

library(roll)

set.seed(1)
x_var <- data.frame(X1 = rnorm(50))
y_var <- data.frame(y1 = rnorm(50),
                    y2 = rnorm(50),
                    y3 = rnorm(50),
                    y4 = rnorm(50))

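# Rolling regression of each y column on X1, using a 12-observation window
model_output <- roll_lm(x = as.matrix(x_var),
                        y = as.matrix(y_var),
                        width = 12,
                        min_obs = 12,
                        na_restore = TRUE)

# Rolling coefficients (intercepts and betas) for every y column
model_output$coefficients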

The above gives me the output I want, but I'm not sure how to combine all intercept values in a single dataframe and all beta values in another, with the correct column labels.


Solution

  • Here's a fairly simple tidyverse implementation. The only complicated part is the column renaming, which uses regex capture groups to carry the Y and X indices into the new names:

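    library(tidyverse)

    # Flatten the list returned by roll_lm() into one wide data frame;
    # the list element names become column-name prefixes
    model_output <- as.data.frame(model_output)
    
    # R-squared columns: drop the "r.squared." prefix, leaving y1, y2, ...
    rsq_df <- model_output %>%
        select(starts_with("r.squared")) %>%
        rename_with(~ str_remove(., "r.squared."))
    
    # Intercept columns: capture the y index and rename to "Y<n> Intercept"
    intercept_df <- model_output %>%
        select(starts_with("coefficients")) %>%
        select(ends_with("Intercept.")) %>%
        rename_with(~ str_replace_all(., "coefficients.y(\\d+)..Intercept.", "Y\\1 Intercept"))
    
    # Slope columns: capture the y and X indices and rename to "Y<n> Coef <m>"
    coef_df <- model_output %>%
        select(starts_with("coefficients")) %>%
        select(!ends_with("Intercept.")) %>%
        rename_with(~ str_replace_all(., "coefficients.y(\\d+).X(\\d+)", "Y\\1 Coef \\2"))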
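    To see what the renaming patterns above do without running the regression, here is a minimal base R sketch. It assumes the flattened column names look like `coefficients.y1..Intercept.` and `coefficients.y1.X1`; that shape is inferred from the patterns in the answer, not checked against the actual output of `roll_lm`, and the `cols` vector is made up for illustration.

```r
# Hypothetical column names, shaped like the ones the patterns above expect
cols <- c("coefficients.y1..Intercept.", "coefficients.y1.X1",
          "coefficients.y2..Intercept.", "coefficients.y2.X1",
          "r.squared.y1", "r.squared.y2")

# Split the coefficient columns into intercepts and slopes
is_coef      <- startsWith(cols, "coefficients")
is_intercept <- is_coef & endsWith(cols, "Intercept.")

# The first capture group keeps the y index; the second keeps the X index
intercept_names <- sub("^coefficients\\.y([0-9]+)\\.\\.Intercept\\.$",
                       "Y\\1 Intercept", cols[is_intercept])
coef_names <- sub("^coefficients\\.y([0-9]+)\\.X([0-9]+)$",
                  "Y\\1 Coef \\2", cols[is_coef & !is_intercept])

print(intercept_names)  # "Y1 Intercept" "Y2 Intercept"
print(coef_names)       # "Y1 Coef 1" "Y2 Coef 1"
```

    The dots are escaped here so that they match only literal dots. The unescaped versions in the answer match the same names, because `.` matches any character, but they are looser than they need to be.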