Loops through variable lists - embedded loops

I am a long-time STATA user finally trying to master R so that I can improve my graphics. In the following code, I fit a GAM to an outcome variable y1 using 5 exposure variables x1 - x5 and then plot the predictions for x1.

What I want to do is have two loops, one embedded inside the other so that the first loop iterates over the 5 outcomes, fitting the GAM for each and then, in the embedded second loop, it iterates over the 5 exposures, plotting the predictions for each. The result would be twenty five plots of 5 variables from each of five GAMs. In the real database, the variables aren't numbered, so it has to loop over the variable names as strings.

y1.gam <- mgcv::gam(y1~s(x1,bs="cr",fx=TRUE)+
                   s(x2,bs="cr",fx=TRUE)+
                   s(x3,bs="cr",fx=TRUE)+
                   s(x4,bs="cr",fx=TRUE)+
                   s(x5,bs="cr",fx=TRUE)+
                  family = poisson(link = "log"),
                  data = data)
y1.x1.plot <- plotGAM(gamFit = y1.gam , smooth.cov = "x1", groupCovs = NULL,
       plotCI=TRUE, orderedAsFactor = FALSE)

If it helps, here's how it would go in STATA:

global outcome y1 y2 y3 y4 y5
global exposure x1 x2 x3 x4 x5

foreach v of varlist $outcome {
    gam `v’ $exposure, …
    foreach w of varlist $exposure{
         plot `w’…
    }
}

Hope you can help.

Thanks.

Josh

Solution

Try this. Used mtcars as example dataset, even if it's for sure not an appropriate example dataset for this kind of model. However, I hope it's sufficient to show the general approach. The key of the loops is to use substitute to set up a formula object for the estimation step. The result is a list containing the models, the plots, the formulas, ...

vars_outcome <- c("mpg", "disp")
vars_exposure <- c("hp", "qsec")

# Grid of outcome and exposure variables
vars_grid <- expand.grid(out = vars_outcome, exp = vars_exposure, stringsAsFactors = FALSE)
# Init list for formulas, models, plots
mods <- list(out = vars_grid$out, exp = vars_grid$exp, fmla = list(), mod = list(), mod = list())

for (i in seq_len(nrow(vars_grid))) {
  # Set up the formula
  mods$fmla[[i]] <- substitute(out ~ s(exp, bs="cr",fx=TRUE), list(out = as.name(mods$out[[i]]), exp = as.name(mods$exp[[i]])))
  # Estimate Model
  mods$mod[[i]] <- mgcv::gam(mods$fmla[[i]], family = poisson(link = "log"), data = mtcars)
  # Plot Model
  mods$plt[[i]] <- voxel::plotGAM(gamFit = mods$mod[[i]] , smooth.cov = mods$exp[[i]], groupCovs = NULL, plotCI=TRUE, orderedAsFactor = FALSE)

  # Create a "variable" containing the plot 
  assign(paste(mods$out[[i]], mods$exp[[i]], sep = "_"), mods$plt[[i]])

}

## Name the list with the plots
names(mods$plt) <- paste(mods$out, mods$exp, sep = "_")

mods$fmla[[1]]
#> mpg ~ s(hp, bs = "cr", fx = TRUE)
mods$mod[[1]]
#> 
#> Family: poisson 
#> Link function: log 
#> 
#> Formula:
#> mpg ~ s(hp, bs = "cr", fx = TRUE)
#> attr(,".Environment")
#> <environment: R_GlobalEnv>
#> 
#> Estimated degrees of freedom:
#> 9  total = 10 
#> 
#> UBRE score: -0.1518201
mods$plt[[1]]

^{Created on 2020-03-28 by the reprex package (v0.3.0)}