Search code examples
rloopspurrr

Simplifying a for loop with purrr::map or any of its variants in R


I am working on some multiple regression problem that has different outcomes and different predictors.

I have a written a script that takes the outcomes and predictors and iterates over a function in a for loop.

What I am looking for is something like purrr::map or any of its variants to replace the for loop.

For illustrative purposes, I sharing some synthetic data that mimics the original data along with the script I have written.

# create synthetic data
df <- data.frame(y1=sample(rnorm(n=50, mean=3.25, sd=.25), replace=TRUE),
                 y2=sample(rnorm(n=50, mean=3.75, sd=.48), replace=TRUE),
                 x1=sample(rnorm(n=50, mean=4.28, sd=.32), replace=TRUE),
                 x2=sample(rnorm(n=50, mean=3.75, sd=.64), replace=TRUE),
                 x3=sample(rnorm(n=50, mean=3.99, sd=.55), replace=TRUE),
                 x4=sample(runif(n=50, min=1L, max=2L), replace=TRUE),
                 wgt=sample(runif(n=50, min=.20, max=.75), replace=TRUE))


# regression function
reg_func <- function(y, ...){
  x = sapply(substitute(...()), deparse)
  f = reformulate(termlabels=x, response=y)
  model = eval(lm(f, data=df, weights=wgt, na.action=na.omit))
  an0va = anova(model)
  jt = jtools::summ(model, confint=TRUE, ci.width=0.95, robust=FALSE, vifs=TRUE)
  list(outcome=y, model_summary=summary(model), a0nova=an0va, jtools_summumary=jt)
}

# select the dvs and set names
dvs <- names(df)[1:2]
dvs <- purrr::set_names(dvs)

I am looking for something that simplifies the looping part of my script below

# loop dependent variables and store the results
reg_out = list()
reg_out2 = list()
for (i in seq_along(dvs)){
  reg_out[[i]] = reg_func(y=dvs[i], x1, x2)
  reg_out2[[i]] = reg_func(y=dvs[i], x3, x4)
}
reg_out
reg_out2

Solution

  • Just use map() on the dvs vector itself and output a named list where you can access each independent variable and then the models you want.

    library(purrr)
    library(jtools)
    
    reg_out <- map(
      dvs,
      ~ list(
        m1 = reg_func(y = .x, x1, x2),
        m2 = reg_func(y = .x, x3, x4)
      )
    )
    
    reg_out$y1$m1
    #> $outcome
    #> [1] "y1"
    #> 
    #> $model_summary
    #> 
    #> Call:
    #> lm(formula = f, data = df, weights = wgt, na.action = na.omit)
    #> 
    #> Weighted Residuals:
    #>      Min       1Q   Median       3Q      Max 
    #> -0.45116 -0.15721  0.03369  0.11011  0.42508 
    #> 
    #> Coefficients:
    #>             Estimate Std. Error t value Pr(>|t|)    
    #> (Intercept)  3.16364    0.73468   4.306 8.39e-05 ***
    #> x1           0.04892    0.15780   0.310    0.758    
    #> x2          -0.02670    0.06838  -0.390    0.698    
    #> ---
    #> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    #> 
    #> Residual standard error: 0.2102 on 47 degrees of freedom
    #> Multiple R-squared:  0.00565,    Adjusted R-squared:  -0.03666 
    #> F-statistic: 0.1335 on 2 and 47 DF,  p-value: 0.8753
    #> 
    #> 
    #> $a0nova
    #> Analysis of Variance Table
    #> 
    #> Response: y1
    #>           Df  Sum Sq  Mean Sq F value Pr(>F)
    #> x1         1 0.00506 0.005064  0.1146 0.7365
    #> x2         1 0.00674 0.006736  0.1524 0.6980
    #> Residuals 47 2.07680 0.044187               
    #> 
    #> $jtools_summumary
    #> MODEL INFO:
    #> Observations: 50
    #> Dependent Variable: y1
    #> Type: OLS linear regression 
    #> 
    #> MODEL FIT:
    #> F(2,47) = 0.13, p = 0.88
    #> R² = 0.01
    #> Adj. R² = -0.04 
    #> 
    #> Standard errors: OLS
    #> ----------------------------------------------------------------
    #>                      Est.    2.5%   97.5%   t val.      p    VIF
    #> ----------------- ------- ------- ------- -------- ------ ------
    #> (Intercept)          3.16    1.69    4.64     4.31   0.00       
    #> x1                   0.05   -0.27    0.37     0.31   0.76   1.01
    #> x2                  -0.03   -0.16    0.11    -0.39   0.70   1.01
    #> ----------------------------------------------------------------