Creating an R data frame of gam models

If I fit three different gam models as follows:

library(mgcv)

# named dat (not df) because the gam() calls below use data = dat
dat <- data.frame(count = rpois(100, 1),
                  pred1 = rnorm(100, 10, 1),
                  pred2 = rnorm(100, 0, 1),
                  pred3 = rnorm(100, 0, 1))

m1 <- gam(count ~ s(pred1),
          data = dat, 
          family = poisson(link="log"), 
          method = "REML", 
          select = TRUE)

m2 <- gam(count ~ s(pred2),
          data = dat, 
          family = poisson(link="log"), 
          method = "REML", 
          select = TRUE)

m3 <- gam(count ~ s(pred3),
          data = dat, 
          family = poisson(link="log"), 
          method = "REML", 
          select = TRUE)

And then try to put them into a single data frame:

models <- data.frame(m = c(m1,m2,m3))

I get this error:

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : 
  cannot coerce class ‘"family"’ to a data.frame

Any ideas on how to fix this? I want to create a structure that I can loop over to make some predictions from.


  • As the docs indicate, the return value of mgcv::gam is an object of class "gam". This gamObject inherits from the base R model classes glm and lm, so it contains many underlying elements that cannot easily be bound into the two dimensions of a data frame:

    Fitted Gam Object

    A fitted GAM object returned by function gam and of class "gam" inheriting from classes "glm" and "lm". Method functions anova, logLik, influence, plot, predict, print, residuals and summary exist for this class.
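To make the cause of the error concrete, here is a minimal sketch, assuming mgcv is installed and using freshly simulated data. Each fitted model is itself a list of components, so c() concatenates those components into one long, unclassed list, and data.frame() then tries to coerce each component (including the family element) into a column. Wrapping the models in list() keeps each one intact.

```r
library(mgcv)

set.seed(1)
dat <- data.frame(count = rpois(100, 1),
                  pred1 = rnorm(100, 10, 1),
                  pred2 = rnorm(100, 0, 1))

m1 <- gam(count ~ s(pred1), data = dat,
          family = poisson(link = "log"), method = "REML")
m2 <- gam(count ~ s(pred2), data = dat,
          family = poisson(link = "log"), method = "REML")

# A fitted gam is a list with a class attribute
is.list(m1)
inherits(m1, "gam")

# c() splices the components of both models together and drops the class
flat <- c(m1, m2)
length(flat) == length(m1) + length(m2)
inherits(flat, "gam")

# list() keeps every model as a single, intact element
kept <- list(m1 = m1, m2 = m2)
inherits(kept$m2, "gam")
```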

    Usually, to retrieve estimates from these model objects, you would run summary, which returns a list of named elements (for a gam fit these include p.table for the parametric terms and s.table for the smooth terms). From there, extract the needed components, each of which may be a vector, matrix, or list, into data frames. Note: because the underlying components vary in length and type, there is no simple method to extract all estimates of a model into a data frame.

    You will have to ask yourself:

    • Which specific model estimates do I want in a data frame?

    • Do I keep all three model estimates in one data frame or use a list of data frames?

    • What indicator data (data, formula, etc.) should I store to distinguish each model from the others?

    StackOverflow R posts contain many examples of how to extract model estimates like coefficients into data frames.
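As one possible answer to the questions above, the sketch below keeps a single data frame of smooth-term estimates and uses the predictor name as the indicator column. It assumes mgcv is installed and that the smooth-term table of summary() (its s.table element) is the estimate of interest; the names fits and smooth_df are illustrative only.

```r
library(mgcv)

set.seed(42)
dat <- data.frame(count = rpois(100, 1),
                  pred1 = rnorm(100, 10, 1),
                  pred2 = rnorm(100, 0, 1),
                  pred3 = rnorm(100, 0, 1))

# Fit one model per predictor and keep the fits in a named list
predictors <- names(dat)[-1]
fits <- lapply(predictors, function(col) {
  gam(as.formula(paste0("count ~ s(", col, ")")),
      data = dat, family = poisson(link = "log"),
      method = "REML", select = TRUE)
})
names(fits) <- predictors

# One row per smooth term, with a column identifying the model
smooth_df <- do.call(rbind, lapply(names(fits), function(nm) {
  tab <- summary(fits[[nm]])$s.table
  data.frame(model = nm, term = rownames(tab), tab, row.names = NULL)
}))

smooth_df
```

A list of data frames (one per model) would work just as well; binding them is mainly convenient when the tables share the same columns, as they do here.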

    One implementation is to define a function that extracts the model estimates and takes a formula as its input parameter, since the formula appears to be the only difference between the three models.

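    library(mgcv)

    run_gam_models <- function(my_formula) {
          fit <- gam(my_formula,
                     data = dat, 
                     family = poisson(link="log"), 
                     method = "REML", 
                     select = TRUE)
          results <- summary(fit)
          # summary.gam keeps smooth-term estimates in s.table (parametric terms are in p.table)
          # ... add other components of results here as needed
          df <- data.frame(term = rownames(results$s.table), results$s.table, row.names = NULL)
          return(df)
    }

    # fit one model per predictor column (every column of dat except count)
    coeffs_df_list <- sapply(names(dat)[-1], function(col) {
           f <- as.formula(paste0("count ~ s(", col, ")"))
           run_gam_models(f)
    }, simplify = FALSE)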

    Online Demo (using glm)
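Since the stated goal was a structure to loop over for predictions, a named list of the fitted models may be all that is needed. The following is a rough sketch under that assumption, with mgcv installed; fit_one, newdata, and preds_df are hypothetical names, and the newdata values are made up for illustration.

```r
library(mgcv)

set.seed(123)
dat <- data.frame(count = rpois(100, 1),
                  pred1 = rnorm(100, 10, 1),
                  pred2 = rnorm(100, 0, 1),
                  pred3 = rnorm(100, 0, 1))

# Hypothetical helper: fit the Poisson GAM for a single predictor column
fit_one <- function(col) {
  gam(as.formula(paste0("count ~ s(", col, ")")),
      data = dat, family = poisson(link = "log"),
      method = "REML", select = TRUE)
}

# Named list of intact model objects
models <- lapply(names(dat)[-1], fit_one)
names(models) <- paste0("m", seq_along(models))

# Illustrative new data; it must contain every predictor the models use
newdata <- data.frame(pred1 = seq(8, 12, length.out = 5),
                      pred2 = seq(-2, 2, length.out = 5),
                      pred3 = seq(-2, 2, length.out = 5))

# Loop over the models; each column holds one model's predicted counts
preds <- vapply(models,
                function(m) as.numeric(predict(m, newdata = newdata, type = "response")),
                numeric(nrow(newdata)))
preds_df <- as.data.frame(preds)

preds_df
```

Only the predictions, which are plain numeric vectors of equal length, end up in a data frame; the model objects themselves stay in the list.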