I am trying to understand the "glmnet" package, but I still have some questions.
1. What do the numbers along the top (31, 31, 31, ..., 3, 2, 2, 2) mean?
2. What are the vertical dotted lines, and why are two lines marked?
3. Why does the curve show a curvilinear pattern?
library(glmnet)
data(MultinomialExample)
cvfit=cv.glmnet(x, y, family="multinomial", type.multinomial = "grouped")
plot(cvfit)
Below is the resulting plot of cvfit.
Thank you
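With cross-validation you are trying to find, in this case, the best value of lambda for the elastic net. Briefly, the elastic net is a mixture of lasso and ridge regression: ridge regression shrinks all of your coefficients towards zero, while lasso can set some of them exactly to zero. Since your call does not set alpha, glmnet uses its default of alpha = 1, which is the pure lasso penalty. lambda (λ) controls how strongly the coefficients are pushed towards zero.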
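To make the lasso versus ridge difference concrete, here is a minimal sketch on simulated data. The data, the sizes n and p, and the object names lasso and ridge are assumptions made up for illustration; they are not part of the question.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 20
# Simulated predictors; only the first two actually drive the response
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - 2 * x[, 2] + rnorm(n)

lasso <- glmnet(x, y, alpha = 1)  # pure lasso penalty (the glmnet default)
ridge <- glmnet(x, y, alpha = 0)  # pure ridge penalty

# df holds the number of non-zero coefficients at each lambda on the path
range(lasso$df)  # lasso drops coefficients entirely as lambda grows
range(ridge$df)  # ridge shrinks coefficients but keeps them in the model
```

Values of alpha between 0 and 1 give a blend of the two penalties.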
The x-axis shows the lambda values (on a log scale) that glmnet tried during cross-validation. On the far left lambda is close to zero, so you would expect nearly all of your coefficients to be non-zero. The numbers along the top give the number of non-zero coefficients at each lambda. You can also see these counts, listed from the largest lambda to the smallest (that is, right to left on the plot), with:
cvfit$nzero
s0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14 s15 s16 s17 s18 s19
0 1 1 1 1 1 1 2 3 3 7 7 8 8 9 9 9 10 10 10
s20 s21 s22 s23 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33 s34 s35 s36 s37 s38 s39
12 13 14 14 18 18 20 20 21 23 23 25 26 26 26 26 27 27 28 28
s40 s41 s42 s43 s44 s45 s46 s47 s48 s49 s50 s51 s52 s53 s54 s55 s56 s57 s58 s59
29 29 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
s60 s61 s62 s63 s64 s65 s66 s67 s68 s69 s70 s71
30 30 30 30 30 30 30 30 30 30 30 30
The vignette describes this as:
nzero: number of non-zero coefficients at each ‘lambda’.
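As a rough sketch of how the top-axis numbers line up with the x-axis, the counts can be tabulated next to the lambda values. The simulated three-class data below stand in for MultinomialExample and are an assumption, as is the name path; only cv.glmnet and the lambda and nzero components come from glmnet itself.

```r
library(glmnet)

set.seed(1)
n <- 300; p <- 20
x <- matrix(rnorm(n * p), n, p)
# Hypothetical signal: only the first three predictors separate the classes
eta <- cbind(2 * x[, 1], 2 * x[, 2], 2 * x[, 3])
prob <- exp(eta) / rowSums(exp(eta))
y <- factor(apply(prob, 1, function(pr) sample(3, 1, prob = pr)))

cvfit <- cv.glmnet(x, y, family = "multinomial", type.multinomial = "grouped")

# One row per lambda on the path, stored from largest lambda to smallest
path <- data.frame(
  lambda     = cvfit$lambda,
  log_lambda = log(cvfit$lambda),
  nzero      = as.integer(cvfit$nzero)
)
head(path, 3)  # large lambda: few or no non-zero coefficients
tail(path, 3)  # small lambda: most coefficients are non-zero
```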
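The y-axis is the multinomial deviance, a measure of prediction error, averaged over the cross-validation folds at each tested lambda (the error bars show one standard error either side of the mean). The lower it is, the better the predictive ability of your model. The curve is typically U-shaped because a very small lambda keeps too many coefficients and the model overfits, while a very large lambda shrinks away useful coefficients and the model underfits. The optimal lambda is the one that gives the lowest cross-validated error. This is the first line from the left: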
cvfit$lambda.min
[1] 0.01291017
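If it helps, lambda.min can be recovered by hand from the cross-validated error curve stored in cvm. This is a sketch on simulated data (an assumption standing in for MultinomialExample), and i_min is just an illustrative name.

```r
library(glmnet)

set.seed(1)
n <- 300; p <- 20
x <- matrix(rnorm(n * p), n, p)
# Hypothetical signal: only the first three predictors separate the classes
eta <- cbind(2 * x[, 1], 2 * x[, 2], 2 * x[, 3])
prob <- exp(eta) / rowSums(exp(eta))
y <- factor(apply(prob, 1, function(pr) sample(3, 1, prob = pr)))

cvfit <- cv.glmnet(x, y, family = "multinomial", type.multinomial = "grouped")

# cvm is the mean cross-validated deviance at each lambda (the red dots)
i_min <- which.min(cvfit$cvm)
cvfit$lambda[i_min]  # lambda with the lowest mean cross-validated error
cvfit$lambda.min     # the value glmnet reports
```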
The second line marks the largest lambda whose error is still within one standard error of the minimum. It uses fewer coefficients (hence is more parsimonious) and is still not too far away from the best predictive model:
cvfit$lambda.1se
[1] 0.02717467
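The one-standard-error rule can be sketched by hand as well: take the minimum error plus its standard error as a threshold, then pick the largest lambda whose error stays under it. The simulated data and the names threshold, lambda_1se, n_min and n_1se are assumptions for illustration.

```r
library(glmnet)

set.seed(1)
n <- 300; p <- 20
x <- matrix(rnorm(n * p), n, p)
# Hypothetical signal: only the first three predictors separate the classes
eta <- cbind(2 * x[, 1], 2 * x[, 2], 2 * x[, 3])
prob <- exp(eta) / rowSums(exp(eta))
y <- factor(apply(prob, 1, function(pr) sample(3, 1, prob = pr)))

cvfit <- cv.glmnet(x, y, family = "multinomial", type.multinomial = "grouped")

# Threshold: minimum mean error plus one standard error at that lambda
i_min <- which.min(cvfit$cvm)
threshold <- cvfit$cvm[i_min] + cvfit$cvsd[i_min]

# Largest (most regularised) lambda whose mean error is within the threshold
lambda_1se <- max(cvfit$lambda[cvfit$cvm <= threshold])
lambda_1se
cvfit$lambda.1se

# Number of non-zero coefficients at each of the two marked lambdas
n_min <- cvfit$nzero[match(cvfit$lambda.min, cvfit$lambda)]
n_1se <- cvfit$nzero[match(cvfit$lambda.1se, cvfit$lambda)]
c(n_min, n_1se)
```

To use the more parsimonious model, pass s = "lambda.1se" to coef() or predict() on the cvfit object.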
You can read more about it in Friedman et al. or in this post.