Tags: r, regression, linear-regression, glmnet

How can I force cv.glmnet not to drop one specific variable?


I am running a regression with 67 observations and 32 variables. I am doing variable selection using the cv.glmnet function from the glmnet package. There is one variable I want to force into the model (it is dropped during the normal procedure). How can I specify this condition in cv.glmnet?

Thank you!

My code looks like the following:

glmntfit <- cv.glmnet(mydata[,-1], mydata[,1])
coef(glmntfit, s=glmntfit$lambda.1se)

And the variable I want is mydata[,2].


Solution

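  • This can be achieved by supplying a penalty.factor vector, as described in ?glmnet. A penalty factor of 0 means the "variable is always included in the model", while 1 is the default. The vector needs one entry per column of the predictor matrix, in column order, so the leading 0 in the call below applies to mydata[,2], which is the first column of mydata[,-1].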

    glmntfit <- cv.glmnet(mydata[,-1], mydata[, 1], 
                          penalty.factor=c(0, rep(1, ncol(mydata) - 2)))
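If the forced variable is not the first predictor column, building the vector by column name may be less error-prone than counting positions. The following is a minimal sketch on simulated data, not the asker's actual data: the column names (x1, x5, x10), the simulated response, and the helper names xmat, pf, and forced_coef are all assumptions made for illustration.

```r
library(glmnet)

set.seed(1)
n <- 67   # observations, as in the question
p <- 31   # predictors (hypothetical count for this sketch)

# Simulated predictors with hypothetical column names x1..x31
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("x", seq_len(p))

# The response depends only on x5 and x10; x1 is pure noise,
# so it would normally be a candidate for being dropped
y <- 2 * x[, "x5"] - 1.5 * x[, "x10"] + rnorm(n)
mydata <- cbind(y, x)

xmat <- mydata[, -1]

# One penalty factor per predictor column: 1 = default penalty,
# 0 = no shrinkage, so that variable is always kept
pf <- rep(1, ncol(xmat))
pf[colnames(xmat) == "x1"] <- 0

fit <- cv.glmnet(xmat, mydata[, 1], penalty.factor = pf)

# Coefficients at lambda.1se; the row for x1 should be non-zero
b <- coef(fit, s = "lambda.1se")
forced_coef <- b["x1", 1]
print(forced_coef)
```

Because an unpenalized coefficient is not shrunk at all, it is worth checking that the estimate for the forced variable looks reasonable before relying on the selected model.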