r, r-caret, lasso-regression

Preprocessing in LASSO using the caret package in R


I fitted the same LASSO logistic regression model twice, once without pre-processing and once with pre-processing (centering and scaling). I used 5-fold cross-validation in both cases.

However, I am getting the same value for the optimal tuning parameter in both fits.

My code is as follows:

Without pre-processing

library(ISLR)   # provides the Smarket data
library(caret)
set.seed(123)
# 5-fold cross-validation, keeping the hold-out predictions
fitControl <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
mod_fitg <- train(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Volume,
                 data = Smarket, method = "glmnet",
                 trControl = fitControl,
                 tuneGrid = expand.grid(
                   .alpha = 1,                                # alpha = 1 is the LASSO penalty
                   .lambda = 10^seq(-5, 5, length.out = 100)),
                 family = "binomial")

mod_fitg$bestTune

> mod_fitg$bestTune
   alpha      lambda
25     1 0.002656088

With pre-processing

set.seed(123)
fitControl <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
# Same model, but with the predictors centered and scaled by caret
mod_fitgc <- train(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Volume,
                  data = Smarket, method = "glmnet",
                  trControl = fitControl,
                  tuneGrid = expand.grid(
                    .alpha = 1,
                    .lambda = 10^seq(-5, 5, length.out = 100)),
                  family = "binomial", preProcess = c("center", "scale"))

mod_fitgc$bestTune

> mod_fitgc$bestTune
   alpha      lambda
25     1 0.002656088

Did I make a mistake here?

Am I using the caret package correctly?

I have fitted other models, such as SVM and KNN, using the caret package. For those models I got different results after pre-processing.

Thank you


Solution

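  • Your code is fine. You partly answered your own question: "I fitted other models like SVM or KNN using caret package. For those models i got different results after pre processing." Unlike those models, glmnet standardizes the predictors internally by default (standardize = TRUE), so centering and scaling them beforehand leaves the LASSO fit, and hence the selected lambda, essentially unchanged. Here is some reference material that might answer your question.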

    https://stats.stackexchange.com/questions/33674/why-do-lars-and-glmnet-give-different-solutions-for-the-lasso-problem
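
    One way to check this yourself is to call glmnet directly on the raw and on the pre-scaled predictors. The following is only a minimal sketch, assuming the ISLR and glmnet packages are installed; the lambda grid and the variable names (x_scaled, diff_std, diff_nostd) are illustrative choices, not part of the original question.

```r
library(ISLR)    # Smarket data, as in the question
library(glmnet)

# Predictor matrix and response used in the question
x <- as.matrix(Smarket[, c("Lag1", "Lag2", "Lag3", "Lag4", "Volume")])
y <- Smarket$Direction

# Center and scale by hand, mirroring preProcess = c("center", "scale")
x_scaled <- scale(x)

# Illustrative (assumed) lambda grid, in decreasing order
lambda <- 10^seq(-1, -5, length.out = 50)

# Default behaviour: glmnet standardizes internally (standardize = TRUE)
fit_raw    <- glmnet(x, y, family = "binomial", alpha = 1, lambda = lambda)
fit_scaled <- glmnet(x_scaled, y, family = "binomial", alpha = 1, lambda = lambda)

p_raw    <- predict(fit_raw, newx = x, type = "response")
p_scaled <- predict(fit_scaled, newx = x_scaled, type = "response")

# Largest difference in fitted probabilities over all lambdas
diff_std <- max(abs(p_raw - p_scaled))
print(diff_std)

# With internal standardization switched off, pre-scaling can matter
fit_raw_ns    <- glmnet(x, y, family = "binomial", alpha = 1,
                        lambda = lambda, standardize = FALSE)
fit_scaled_ns <- glmnet(x_scaled, y, family = "binomial", alpha = 1,
                        lambda = lambda, standardize = FALSE)

diff_nostd <- max(abs(predict(fit_raw_ns, newx = x, type = "response") -
                      predict(fit_scaled_ns, newx = x_scaled, type = "response")))
print(diff_nostd)
```

    If the first difference is close to zero (up to convergence tolerance), the two fits are effectively the same model, which would explain why caret selects the same lambda. The second difference shows how much pre-scaling changes the fit once glmnet no longer standardizes on its own.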