r loops apply glmnet regularized

Simultaneous elastic net regressions


I'm trying to fit elastic net models in R for multiple variables at the same time. I have 15 variables Xi, and I want to fit an elastic net model for each variable in turn, regressing it on the others. For a single Xi, I can obtain the optimal alpha and lambda as follows:

A <- matrix(rnorm(150), nrow = 10, ncol = 15)  # 10 x 15 = 150 random values
colnames(A) <- paste0("X", 1:15)
A #random data

library(glmnetUtils)
library(glmnet)
library(coefplot)
set.seed(1234)    
 
# Train model (the formula interface expects a data frame, so convert the matrix).
fit <- cva.glmnet(X1 ~ ., data = as.data.frame(A))

# Get alpha.
get_alpha <- function(fit) {
  alpha <- fit$alpha
  error <- sapply(fit$modlist, function(mod) {min(mod$cvm)})
  alpha[which.min(error)]
}


# Get all parameters.
get_model_params <- function(fit) {
  alpha <- fit$alpha
  lambdaMin <- sapply(fit$modlist, `[[`, "lambda.min")
  error <- sapply(fit$modlist, function(mod) {min(mod$cvm)})
  best <- which.min(error)
  data.frame(alpha = alpha[best], lambdaMin = lambdaMin[best])
}

get_model_params(fit) 

I want to run this procedure for every Xi and end up with two data frames, one holding the optimal lambda.min values and one holding the optimal alpha values, plus a list of the coefficients obtained with each optimal alpha and lambda.min. Can someone help me do that?


Solution

  • You need to loop the function over every column, treating each one in turn as the response:

    loop <- function(data) {
      # Output data frame: one row per response column
      output <- as.data.frame(matrix(NA, nrow = ncol(data), ncol = 2))
      colnames(output) <- c('alpha', 'lambdaMin')
      rownames(output) <- colnames(data)
      # Loop over each column, using it as the response
      for (i in seq_len(ncol(data))) {
        # x = all other columns, y = column i
        fit <- cva.glmnet(data[, -i], data[, i])
        # Store the best alpha and lambda.min in the ith row
        output[i, ] <- get_model_params(fit)
      }
      output
    }
    
    loop(A)
    

    This uses the x, y (matrix) interface of cva.glmnet instead of the formula interface: data[, i] selects column i as the response, and data[, -i] keeps every other column as the predictors.
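
The question also asks for the coefficients at each optimal alpha and lambda.min, which the loop above does not collect. One way this might be done is sketched below. The helper name `fit_one` and the object names `results`, `alphas`, `lambdas` and `coefs` are hypothetical, and the example data uses 50 rows instead of 10 (an assumption, so that each cross-validation fold holds more than one observation). The selection logic is the same as in `get_model_params` from the question; the coefficients are read from the chosen `cv.glmnet` model with `coef(..., s = "lambda.min")`.

```r
library(glmnet)
library(glmnetUtils)

set.seed(1234)
# Assumed example data: 50 observations of 15 variables (pure noise).
A <- matrix(rnorm(50 * 15), nrow = 50, ncol = 15)
colnames(A) <- paste0("X", 1:15)

# Hypothetical helper: use column i as the response and all other columns
# as predictors, then return the best alpha, its lambda.min, and the
# coefficients of that model at lambda.min.
fit_one <- function(data, i) {
  fit <- cva.glmnet(data[, -i], data[, i])
  # Minimum cross-validated error for each alpha that was tried
  error <- sapply(fit$modlist, function(mod) min(mod$cvm))
  best <- which.min(error)
  mod <- fit$modlist[[best]]  # cv.glmnet object for the best alpha
  list(alpha = fit$alpha[best],
       lambdaMin = mod$lambda.min,
       coefficients = coef(mod, s = "lambda.min"))
}

# One fit per column, named after the response variable
results <- lapply(seq_len(ncol(A)), function(i) fit_one(A, i))
names(results) <- colnames(A)

# Two data frames (one row per response) and a list of coefficient matrices
alphas  <- data.frame(alpha = sapply(results, `[[`, "alpha"))
lambdas <- data.frame(lambdaMin = sapply(results, `[[`, "lambdaMin"))
coefs   <- lapply(results, `[[`, "coefficients")
```

`coefs[["X1"]]` then holds the intercept and the 14 slope coefficients for the model with X1 as the response. Keeping everything for one response inside a single list entry avoids refitting the model just to extract the coefficients.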