Search code examples
rfor-loopglmmodel-comparison

Finding model (returned from for loops) with lowest AIC in R


I am trying to find model with lowest AIC. Models are returned from two for loops that make possible combinations of columns. I am unable to make the function return model with lowest AIC. The code below demonstrates where I got stuck:

rm(list = ls())

data <- iris

data <- data[data$Species %in% c("setosa", "virginica"),]

data$Species = ifelse(data$Species == 'virginica', 0, 1)

mod_headers <- names(data[1:ncol(data)-1])

f <- function(mod_headers){
    for(i in 1:length(mod_headers)){
    tab <- combn(mod_headers,i)
    for(j in 1:ncol(tab)){
      tab_new <- c(tab[,j])
      mod_tab_new <- c(tab_new, "Species")
      model <- glm(Species ~., data=data[c(mod_tab_new)], family = binomial(link = "logit"))
    }
    }
  best_model <- model[which(AIC(model)[order(AIC(model))][1])]
  print(best_model)
}

f(mod_headers)

Any suggestions? Thanks!


Solution

  • glm() uses an iterative re-weighted least squares algorithm. The algorithm reaches the maximum number of iterations before it converges - changing this parameter helps in your case:

     glm(Species ~., data=data[mod_tab_new], family = binomial(link = "logit"), control = list(maxit = 50))
    

    There was another issue using which, I replaced it with an if after each model fit to compare to the lowest AIC so far. However, I think there are better solutions than this for-loop approach.

    f <- function(mod_headers){
      lowest_aic <- Inf     # added
      best_model <- NULL    # added
    
      for(i in 1:length(mod_headers)){
        tab <- combn(mod_headers,i)
        for(j in 1:ncol(tab)){
          tab_new <- tab[, j]
          mod_tab_new <- c(tab_new, "Species")
          model <- glm(Species ~., data=data[mod_tab_new], family = binomial(link = "logit"), control = list(maxit = 50))
          if(AIC(model) < lowest_aic){ # added
            lowest_aic <- AIC(model)   # added
            best_model <- model        # added
          }
        }
      }
      return(best_model)
    }