Tags: r, arrays, glmnet

Q: How to apply a multiple LASSO regression?


I am working with two matrices and want to apply the LASSO to a series of subsamples. To do this, I wrote the following code:

library(glmnet)

iv_vardos<-matrix(1:3500, ncol = 30, nrow=3500)
colnames(iv_vardos)<-c("ReV_d_NSW","ReV_w_NSW","ReV_m_NSW","ReV_d_QLD" ,"ReV_w_QLD","ReV_m_QLD",
"ReV_d_SA","ReV_w_SA","ReV_m_SA","ReV_d_VIC","ReV_w_VIC","ReV_m_VIC","ReCov_d_QLD_NSW",
"ReCov_w_QLD_NSW", "ReCov_m_QLD_NSW", "ReCov_d_SA_NSW", "ReCov_w_SA_NSW", "ReCov_m_SA_NSW", "ReCov_d_SA_QLD", 
"ReCov_w_SA_QLD","ReCov_m_SA_QLD","ReCov_d_SA_VIC","ReCov_w_SA_VIC","ReCov_m_SA_VIC",
"ReCov_d_VIC_NSW", "ReCov_w_VIC_NSW", "ReCov_m_VIC_NSW", "ReCov_d_VIC_QLD", "ReCov_w_VIC_QLD", "ReCov_m_VIC_QLD")

dv_vardos<-matrix(3501:7000, ncol = 10, nrow=3500)
colnames(dv_vardos)<-c("NSW","QLD","VIC","SA","QLD_NSW","SA_NSW","SA_QLD","SA_VIC","VIC_NSW","VIC_QLD")

space <- 3490

# With the following code I want to estimate the LASSO coefficients on 10 different subsamples.

Dinamic_betas<-array(NA, c(10, 30, (nrow(iv_vardos)-space)))
for (i in 1:dim(Dinamic_betas)[3]) {
  cv <- cv.glmnet(iv_vardos[i:(space+i-1),], dv_vardos[i:(space+i-1),], alpha = 1, family = "mgaussian")
  LASSO_estimation<- glmnet(iv_vardos[i:(space+i-1),], dv_vardos[i:(space+i-1),], alpha = 1, lambda = cv$lambda.min, family = "mgaussian")
  # My problem starts here, when I try to store the LASSO coefficients. LASSO_estimation is a list of several objects.
  # The one I care about is beta, a list holding the coefficients estimated by the LASSO for each response.
for (j in 1:Dinamic_betas[3]) {
  Dinamic_betas[1,,j]<-t(as.vector(LASSO_estimation[["beta"]][["NSW"]]))
  Dinamic_betas[2,,j]<-t(as.vector(LASSO_estimation[["beta"]][["QLD"]]))
  Dinamic_betas[3,,j]<-t(as.vector(LASSO_estimation[["beta"]][["VIC"]]))
  Dinamic_betas[4,,j]<-t(as.vector(LASSO_estimation[["beta"]][["SA"]]))
  Dinamic_betas[5,,j]<-t(as.vector(LASSO_estimation[["beta"]][["QLD_NSW"]]))
  Dinamic_betas[6,,j]<-t(as.vector(LASSO_estimation[["beta"]][["SA_NSW"]]))
  Dinamic_betas[7,,j]<-t(as.vector(LASSO_estimation[["beta"]][["SA_QLD"]]))
  Dinamic_betas[8,,j]<-t(as.vector(LASSO_estimation[["beta"]][["SA_VIC"]]))
  Dinamic_betas[9,,j]<-t(as.vector(LASSO_estimation[["beta"]][["VIC_NSW"]]))
  Dinamic_betas[10,,j]<-t(as.vector(LASSO_estimation[["beta"]][["VIC_QLD"]]))
 }
}

With this code I want to create a 10x30x10 array, but every time I run it I get the following error:

Error in 1:Dinamic_betas[3] : NA/NaN argument 

Solution

  • OK, I think I see what you are trying to do. You can collect all the coefficients by binding them together with do.call, and then slot them into the array you have created. No inner loop is needed, because each fit fills the i-th slice of the array:

    Dinamic_betas<-array(NA, c(10, 30, (nrow(iv_vardos)-space)))
    
    for (i in 1:dim(Dinamic_betas)[3]) {
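      # Pick lambda by cross-validation on the i-th subsample
      cv <- cv.glmnet(iv_vardos[i:(space+i-1),], dv_vardos[i:(space+i-1),],
                      alpha = 1, family = "mgaussian")

      # Refit on the same subsample at the selected lambda
      LASSO_estimation <- glmnet(iv_vardos[i:(space+i-1),], dv_vardos[i:(space+i-1),],
                                 alpha = 1, lambda = cv$lambda.min, family = "mgaussian")

      # Bind the per-response coefficient columns into one 30 x 10 matrix
      coefs <- as.matrix(do.call(cbind, LASSO_estimation$beta))
      colnames(coefs) <- names(LASSO_estimation$beta)
      # Transpose to 10 x 30 and store it as slice i
      Dinamic_betas[,,i] <- t(coefs)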
    
    }
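
As for the error in the question itself: `Dinamic_betas[3]` is the third element of the array, which is still `NA`, not the size of its third dimension, so `1:Dinamic_betas[3]` evaluates `1:NA`. The sketch below reproduces this on a small stand-in array (the names `a`, `msg` and `slices` are only illustrative) and shows `dim()` returning the intended loop bound.

```r
# Stand-in for Dinamic_betas: a 10 x 30 x 10 array filled with NA
a <- array(NA, c(10, 30, 10))

a[3]       # third element of the array, which is NA
dim(a)[3]  # extent of the third dimension, which is 10

# 1:a[3] is 1:NA, so it fails; capture the message instead of stopping
msg <- tryCatch({ 1:a[3]; "no error" },
                error = function(e) conditionMessage(e))
print(msg)

# Safe way to iterate over the slices of the third dimension
slices <- seq_len(dim(a)[3])
```
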
    

    As a suggestion, you could also store the coefficients in a list and iterate with something like purrr. Arrays can be a bit inflexible.
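
A minimal sketch of the list-based variant, under some loudly stated assumptions: the data are small simulated matrices rather than the ones from the question, the helper `fit_window` and the sizes `n`, `p`, `k` and `window` are hypothetical, and base `lapply` is used so that no extra package is needed (`purrr::map` could be called the same way). Each list element holds one responses-by-predictors coefficient matrix.

```r
library(glmnet)

set.seed(1)
# Simulated stand-ins (assumption): 60 rows, 5 predictors, 3 responses
n <- 60; p <- 5; k <- 3; window <- 50
x <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, paste0("x", 1:p)))
y <- matrix(rnorm(n * k), nrow = n, dimnames = list(NULL, paste0("y", 1:k)))

# Hypothetical helper: fit the multi-response LASSO on the window starting at row i
fit_window <- function(i) {
  rows <- i:(i + window - 1)
  cv  <- cv.glmnet(x[rows, ], y[rows, ], alpha = 1, family = "mgaussian")
  fit <- glmnet(x[rows, ], y[rows, ], alpha = 1,
                lambda = cv$lambda.min, family = "mgaussian")
  coefs <- as.matrix(do.call(cbind, fit$beta))  # p x k
  colnames(coefs) <- names(fit$beta)
  t(coefs)                                      # k x p
}

# One coefficient matrix per window, kept in a list
betas <- lapply(seq_len(n - window), fit_window)

# If an array is still wanted, stack the matrices along a third dimension
betas_array <- simplify2array(betas)
```

Keeping the results in a list means windows can be added, dropped or named without recomputing array dimensions, and the array can still be built at the end if needed.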