Tags: r, loops, variables, regression, purrr

How to run regression with multiple dependent variables and compare these models?


Below, I manage to run a regression for each of several dependent variables. How can I store each fitted model in an object inside the loop and then compare the models on AIC/BIC, for instance with rcompanion::compareLM()?

depvar <- c("qsec", "hp")
indepvar <- paste(c("mpg", "wt", "am", "disp"), collapse = " + ")

for(i in 1:length(depvar)){
  est_model <- lm(paste(depvar[i],"~", indepvar), data=mtcars)
  res <- summary(est_model)
  print(list(depvar[i], res))
}

Solution

    # Fit one model per response and keep every fit in a list
    models <- list()
    for (i in seq_along(depvar)) {
      est_model <- lm(as.formula(paste(depvar[i], "~", indepvar)), data = mtcars)
      models[[i]] <- est_model
    }
    
    names(models) <- depvar
    sapply(models, AIC)
    #     qsec        hp 
    # 97.38901 323.30591 
    
    library(purrr)
    library(broom)
    map_dfr(models, glance, .id = "depvar")
    # # A tibble: 2 × 13
    #   depvar r.squared adj.r.sq…¹ sigma stati…² p.value    df logLik   AIC   BIC devia…³ df.re…⁴  nobs
    #   <chr>      <dbl>      <dbl> <dbl>   <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>   <dbl>   <int> <int>
    # 1 qsec       0.727      0.687  1.00    18.0 2.63e-7     4  -42.7  97.4  106.    27.0      27    32
    # 2 hp         0.784      0.752 34.1     24.5 1.19e-8     4 -156.  323.   332. 31450.       27    32
    # # … with abbreviated variable names ¹​adj.r.squared, ²​statistic, ³​deviance, ⁴​df.residual
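
The same list can be built without an explicit loop. This is only a sketch of one alternative, not the approach used above: `predictors` is a name introduced here for the predictor vector (the question stores it as a pasted string in `indepvar`), and `ic` is a hypothetical name for the summary table. It uses base R only, so purrr and broom are not needed.

```r
# Same two responses and four predictors as in the question
depvar     <- c("qsec", "hp")
predictors <- c("mpg", "wt", "am", "disp")  # assumed name, not in the original

# reformulate() builds `response ~ mpg + wt + am + disp` without pasting strings;
# iterating over a named vector gives a list already named by response
models <- lapply(
  setNames(depvar, depvar),
  function(y) lm(reformulate(predictors, response = y), data = mtcars)
)

# One row per model with both information criteria side by side
ic <- data.frame(
  depvar = names(models),
  AIC    = vapply(models, AIC, numeric(1)),
  BIC    = vapply(models, BIC, numeric(1)),
  row.names = NULL
)
print(ic)
```

`vapply()` is used instead of `sapply()` so that the result is guaranteed to be a numeric vector with one value per model.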
    

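    A word of warning, though: AIC and BIC aren't appropriate for comparing models with different responses. Both are computed from the likelihood of the observed response, so their values depend on that response's scale and units. The gap between 97.4 for qsec and 323.3 for hp therefore says nothing about which model fits better.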
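
By contrast, the criteria are comparable when every candidate model has the same response and is fit to the same rows. The following is a minimal sketch under that assumption: the predictor sets in `candidates` are made up for illustration, and `fits` and `comparison` are hypothetical names that do not appear in the question.

```r
# Hypothetical candidate predictor sets, all for the single response qsec
candidates <- list(
  small = c("wt", "am"),
  full  = c("mpg", "wt", "am", "disp")
)

# Fit every candidate to the same data with the same response
fits <- lapply(
  candidates,
  function(x) lm(reformulate(x, response = "qsec"), data = mtcars)
)

# Lower AIC/BIC is preferred; the values are comparable here because
# the response and the rows used are identical across models
comparison <- data.frame(
  model = names(fits),
  AIC   = vapply(fits, AIC, numeric(1)),
  BIC   = vapply(fits, BIC, numeric(1)),
  row.names = NULL
)
comparison[order(comparison$AIC), ]
```

Sorting by AIC puts the preferred candidate first. BIC penalizes extra parameters more heavily, so the two criteria can rank the candidates differently.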