I am running a long series of regression models using purrr
and broom
. I'd like to generate plots of the residuals for each of them, and ideally save them in a folder somewhere.
My code for the regressions (in simplified form) looks like this:
library(tidyverse)
set.seed(1)
df <- data.frame(x1 = rnorm(100, 0, 1),
x2 = rnorm(100, 0, 1),
y1 = rnorm(100, 0, 1),
y2 = rnorm(100, 0, 1),
t2 = sample(0:1, 100, replace = TRUE),
t1 = sample(0:1, 100, replace = TRUE))
models <- data.frame(outcome = c("y1", "y2", "y2"),
treatment = c("t1", "t1", "t2"),
covariates = c("x1", "x1", "x1 + x2"))
regression_output <- models %>%
mutate(formula = paste(outcome, "~", treatment, "+", covariates),
fit = map(formula, ~ tidy(lm(.x, df)))) %>%
unnest(fit) %>%
filter(term == treatment)
I've written a simple function to save a QQ plot:
save_residual_plot <- function(formula, fit) {
jpeg(paste0("QQ plot - ", formula, ".jpg"))
plot(fit, which = 2)
dev.off()
}
But I have not found a way to run this within the nested dataframe/map framework. Is it possible to do this? Or is there a better way to check the residuals when running a whole series of regression models?
Here is a variation using map2
:
library(tidyverse)
regression_output <- models %>%
mutate(formula_text = paste(outcome, "~", treatment, "+", covariates),
fit = map(formula_text, ~ lm(as.formula(.x), df)))
save_residual_plot <- function(formula_text, fit) {
jpeg(paste0("QQ plot - ", formula_text, ".png"))
plot(fit, which = 2)
dev.off()
}
map2(regression_output$formula_text, regression_output$fit, save_residual_plot)