As we can see, both caret::train(..., method = "glmnet") with cross-validation and cv.glmnet() can find the lambda.min that minimizes the cross-validation error, and the final best model should be the one fitted with lambda.min. Why, then, do we need to pass a grid of lambda values to the training process?
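For reference, here is a minimal sketch of the cv.glmnet() workflow the question refers to; x (a numeric predictor matrix) and y (the response) are placeholders, not objects from this example:

library(glmnet)

# Cross-validate over an automatically chosen lambda path
cvfit <- cv.glmnet(x, y)
cvfit$lambda.min               # the lambda that minimizes CV error
coef(cvfit, s = "lambda.min")  # coefficients of the corresponding fit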
We use a custom tuning grid for a glmnet model because the default tuning grid is very small and there are many more potential glmnet models we may want to explore.
glmnet is capable of fitting two different kinds of penalized models, lasso regression (an L1 penalty that can shrink coefficients exactly to zero) and ridge regression (an L2 penalty that shrinks coefficients toward zero), and it has two tuning parameters:

alpha [0, 1]: the mixing proportion between the two penalties, where 0 is pure ridge and 1 is pure lasso
lambda (0, Inf): the strength of the penalty
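As a quick illustration of the alpha parameter (again with placeholder x and y):

library(glmnet)

ridge <- glmnet(x, y, alpha = 0)  # pure ridge: shrinks all coefficients
lasso <- glmnet(x, y, alpha = 1)  # pure lasso: can zero coefficients out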
The glmnet model can fit many models at once: for a single alpha, all values of lambda are fit simultaneously. This means we can pass a large number of lambda values, which control the amount of penalization in the model.
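To see this path-fitting behavior directly (a sketch with the same placeholder x and y; glmnet expects a decreasing lambda sequence):

library(glmnet)

# One call fits the entire path: one column of coefficients per lambda
fit <- glmnet(x, y, alpha = 0.5, lambda = seq(1, 0.0001, length = 100))
dim(coef(fit))  # (number of predictors + intercept) x 100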
train() is smart enough to fit only one model per alpha value and to pass all of the lambda values at once for simultaneous fitting.
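In the example below, overfit is a two-class training data frame and myControl is a trainControl() object, neither of which is defined in this snippet. A minimal sketch of a myControl consistent with the output shown (ROC-based selection, per-fold logging) might be:

library(caret)

# A sketch, not the exact object used here: ROC requires class
# probabilities and twoClassSummary; verboseIter = TRUE prints the
# per-fold progress lines shown in the sample output.
myControl <- trainControl(
  method = "cv", number = 10,
  summaryFunction = twoClassSummary,
  classProbs = TRUE,
  verboseIter = TRUE
)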
Example:
# Make a custom tuning grid: 2 alpha values x 10 lambda values = 20 models
tuneGrid <- expand.grid(alpha = 0:1, lambda = seq(0.0001, 1, length = 10))

# Fit a model (overfit and myControl as described above)
model <- train(y ~ ., overfit, method = "glmnet",
               tuneGrid = tuneGrid, trControl = myControl)
# Sample Output
Warning message: The metric "Accuracy" was not in the result set. ROC will be used instead.
+ Fold01: alpha=0, lambda=1
- Fold01: alpha=0, lambda=1
+ Fold01: alpha=1, lambda=1
- Fold01: alpha=1, lambda=1
+ Fold02: alpha=0, lambda=1
- Fold02: alpha=0, lambda=1
+ Fold02: alpha=1, lambda=1
- Fold02: alpha=1, lambda=1
+ Fold03: alpha=0, lambda=1
- Fold03: alpha=0, lambda=1
+ Fold03: alpha=1, lambda=1
- Fold03: alpha=1, lambda=1
+ Fold04: alpha=0, lambda=1
- Fold04: alpha=0, lambda=1
+ Fold04: alpha=1, lambda=1
- Fold04: alpha=1, lambda=1
+ Fold05: alpha=0, lambda=1
- Fold05: alpha=0, lambda=1
+ Fold05: alpha=1, lambda=1
- Fold05: alpha=1, lambda=1
+ Fold06: alpha=0, lambda=1
- Fold06: alpha=0, lambda=1
+ Fold06: alpha=1, lambda=1
- Fold06: alpha=1, lambda=1
+ Fold07: alpha=0, lambda=1
- Fold07: alpha=0, lambda=1
+ Fold07: alpha=1, lambda=1
- Fold07: alpha=1, lambda=1
+ Fold08: alpha=0, lambda=1
- Fold08: alpha=0, lambda=1
+ Fold08: alpha=1, lambda=1
- Fold08: alpha=1, lambda=1
+ Fold09: alpha=0, lambda=1
- Fold09: alpha=0, lambda=1
+ Fold09: alpha=1, lambda=1
- Fold09: alpha=1, lambda=1
+ Fold10: alpha=0, lambda=1
- Fold10: alpha=0, lambda=1
+ Fold10: alpha=1, lambda=1
- Fold10: alpha=1, lambda=1
Aggregating results
Selecting tuning parameters
Fitting alpha = 1, lambda = 1 on full training set
# Print model to console
model
# Sample Output
glmnet
250 samples
200 predictors
2 classes: 'class1', 'class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 225, 225, 225, 225, 224, 226, ...
Resampling results across tuning parameters:
  alpha  lambda  ROC        Sens  Spec
  0      0.0001  0.3877717  0.00  0.9786232
  0      0.1112  0.4352355  0.00  1.0000000
  0      0.2223  0.4546196  0.00  1.0000000
  0      0.3334  0.4589674  0.00  1.0000000
  0      0.4445  0.4718297  0.00  1.0000000
  0      0.5556  0.4762681  0.00  1.0000000
  0      0.6667  0.4783514  0.00  1.0000000
  0      0.7778  0.4826087  0.00  1.0000000
  0      0.8889  0.4869565  0.00  1.0000000
  0      1.0000  0.4869565  0.00  1.0000000
  1      0.0001  0.3368659  0.05  0.9188406
  1      0.1112  0.5000000  0.00  1.0000000
  1      0.2223  0.5000000  0.00  1.0000000
  1      0.3334  0.5000000  0.00  1.0000000
  1      0.4445  0.5000000  0.00  1.0000000
  1      0.5556  0.5000000  0.00  1.0000000
  1      0.6667  0.5000000  0.00  1.0000000
  1      0.7778  0.5000000  0.00  1.0000000
  1      0.8889  0.5000000  0.00  1.0000000
  1      1.0000  0.5000000  0.00  1.0000000
ROC was used to select the optimal model using the largest value.
The final values used for the model were alpha = 1 and lambda = 1.
# Plot model
plot(model)
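As a follow-up, the winning parameter pair and its cross-validated score can also be read straight off the train object:

model$bestTune          # alpha = 1, lambda = 1 in the run above
max(model$results$ROC)  # best mean ROC across resamples (0.5 here)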