Search code examples
rpurrr

How to generate residual plots when running a series of regressions with purrr::map?


I am running a long series of regression models using purrr and broom. I'd like to generate plots of the residuals for each of them, and ideally save them in a folder somewhere.

My code for the regressions (in simplified form) looks like this:

library(tidyverse)

set.seed(1)
df <- data.frame(x1 = rnorm(100, 0, 1), 
                 x2 = rnorm(100, 0, 1),
                 y1 = rnorm(100, 0, 1), 
                 y2 = rnorm(100, 0, 1), 
                 t2 = sample(0:1, 100, replace = TRUE), 
                 t1 = sample(0:1, 100, replace = TRUE))

models <- data.frame(outcome = c("y1", "y2", "y2"),
                     treatment = c("t1", "t1", "t2"),
                     covariates = c("x1", "x1", "x1 + x2"))

regression_output <- models %>% 
  mutate(formula = paste(outcome, "~", treatment, "+", covariates), 
         fit = map(formula, ~ tidy(lm(.x, df)))) %>% 
  unnest(fit) %>% 
  filter(term == treatment)

I've written a simple function to save a QQ plot:

save_residual_plot <- function(formula, fit) {
  jpeg(paste0("QQ plot - ", formula, ".jpg"))
  plot(fit, which = 2)
  dev.off()
}

But I have not found a way to run this within the nested dataframe/map framework. Is it possible to do this? Or is there a better way to check the residuals when running a whole series of regression models?


Solution

  • Here is a variation using map2:

    library(tidyverse)
    
    regression_output <- models %>% 
      mutate(formula_text = paste(outcome, "~", treatment, "+", covariates), 
             fit = map(formula_text, ~ lm(as.formula(.x), df))) 
    
    
    save_residual_plot <- function(formula_text, fit) {
      jpeg(paste0("QQ plot - ", formula_text, ".png"))
      plot(fit, which = 2)
      dev.off()
    }
    
    map2(regression_output$formula_text, regression_output$fit, save_residual_plot)