R: coefficients of glmnet::cv.glmnet

As far as I understand, cv.glmnet performs K-fold cross-validation: in each of the K rounds it splits the data into a training set and a validation set. For every fixed lambda, it first fits the model on the training data to get a coefficient vector, then uses that model to predict on the validation set and compute the error.

Hence, for K-fold CV, there are K coefficient vectors for each lambda (one per training set). So what does

coef(cvfit)

get?

Here is an example:

library(glmnet)

x <- iris[1:100, 1:4]
y <- iris[1:100, 5]
y <- factor(y)

fit <- cv.glmnet(data.matrix(x), y, family = "binomial", type.measure = "class",
                 alpha = 1, nfolds = 3, standardize = TRUE)
coef(fit, s = c(fit$lambda.min, fit$lambda.1se))

fit1 <- glmnet(data.matrix(x), y, family = "binomial",
               standardize = TRUE,
               lambda = c(fit$lambda.1se, fit$lambda.min))
coef(fit1)

In fit1, I use the whole dataset as the training set, and the coefficients of fit1 and fit appear to be the same. Why is that?

Thanks in advance.

Solution

Although cv.glmnet checks model performance by cross-validation, the actual model coefficients it returns for each lambda value are based on fitting the model with the full dataset.
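To make that division of labour concrete, here is a rough sketch of the idea, not the actual cv.glmnet source. It assumes a simple unweighted average of the fold errors, and the names full_fit, foldid, err and cv_error are made up for illustration. The per-fold fits are only used to score each lambda; the coefficients come from the full-data fit.

```r
library(glmnet)

set.seed(1)  # fold assignment is random, so fix the seed
x <- data.matrix(iris[1:100, 1:4])
y <- factor(iris[1:100, 5])

# Full-data fit: the reported coefficients come from this path
full_fit <- glmnet(x, y, family = "binomial")
lambdas  <- full_fit$lambda

# Random assignment of observations to K folds (illustrative, K = 3)
K      <- 3
foldid <- sample(rep(seq_len(K), length.out = nrow(x)))

# Per-fold fits are used only to measure held-out misclassification error
err <- matrix(NA_real_, nrow = K, ncol = length(lambdas))
for (k in seq_len(K)) {
  test     <- foldid == k
  fold_fit <- glmnet(x[!test, ], y[!test], family = "binomial", lambda = lambdas)
  pred     <- predict(fold_fit, newx = x[test, ], s = lambdas, type = "class")
  err[k, ] <- colMeans(pred != as.character(y[test]))
}

# Average error per lambda across folds (simplified: unweighted mean)
cv_error <- colMeans(err)

# The fold fits are discarded; coefficients are read off the full-data fit
best_coef <- coef(full_fit, s = lambdas[which.min(cv_error)])
```

The K coefficient vectors from the folds never leave the loop. Only the error curve does, and it is used to choose which lambda to look up on the full-data path.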

The help for cv.glmnet (type ?cv.glmnet) includes a Value section that describes the object returned by cv.glmnet. The returned list object (fit in your case) includes an element called glmnet.fit. The help describes it like this:

glmnet.fit: a fitted glmnet object for the full data.
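If you want to check this yourself, a minimal comparison along these lines should work. It reuses the iris subset from the question; the names cvfit, b_cv, b_full and same are just illustrative. It extracts the coefficients from the cross-validation object and from its stored glmnet.fit element at the same lambda and compares them.

```r
library(glmnet)

set.seed(1)  # fold assignment is random, so fix the seed
x <- data.matrix(iris[1:100, 1:4])
y <- factor(iris[1:100, 5])

cvfit <- cv.glmnet(x, y, family = "binomial", type.measure = "class", nfolds = 3)

# Coefficients reported by the cross-validation object at lambda.1se
b_cv <- coef(cvfit, s = "lambda.1se")

# Coefficients taken directly from the stored full-data fit at the same lambda
b_full <- coef(cvfit$glmnet.fit, s = cvfit$lambda.1se)

# Compare numeric values only, ignoring column names
same <- isTRUE(all.equal(as.matrix(b_cv), as.matrix(b_full),
                         check.attributes = FALSE))
print(same)
```

The same comparison can be made at lambda.min by swapping in s = "lambda.min" and cvfit$lambda.min.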