GLMNET prediction with intercept


I have two questions about prediction using GLMNET - specifically about the intercept.

Here is a small example that creates training data, fits a GLMNET model, and predicts on the training data (which I will later replace with test data):

library(glmnet)
# Train data creation
Train <- data.frame('x1'=runif(10), 'x2'=runif(10))
Train$y <- Train$x1-Train$x2+runif(10)
# From Train data frame to x and y matrix
y <- Train$y
x <- as.matrix(Train[,c('x1','x2')])
# Glmnet model
Model_El <- glmnet(x,y)
Cv_El <- cv.glmnet(x,y)
# Prediction
Test_Matrix <- model.matrix(~.-y,data=Train)[,-1]
Test_Matrix_Df <- data.frame(Test_Matrix)
Pred_El <- predict(Model_El,newx=Test_Matrix,s=Cv_El$lambda.min,type='response')

I want to have an intercept in the estimated formula. This code gives an error concerning the dimensions of the Test_Matrix matrix unless I remove the (Intercept) column of the matrix - as in

Test_Matrix <- model.matrix(~.-y,data=Train)[,-1]

My questions are:

  • Is this the right way to get the prediction when I want the prediction formula to include the intercept?

  • If it is the right way: Why do I have to remove the intercept in the matrix?

Thanks in advance.


Solution

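  • predict needs newx to have the same columns as the x matrix the model was fitted on. Your code fitted the model on x <- as.matrix(Train[,c('x1','x2')]), which has no (Intercept) column, so passing a matrix that does have one to predict gives a dimension error. Note that glmnet estimates an intercept itself by default (intercept = TRUE), so dropping the column with [,-1] as you did is valid and the predictions still include the intercept.

    If you prefer to keep the (Intercept) column, build both matrices the same way. glmnet then sees a constant column, which gets a zero coefficient, while the intercept is still estimated separately: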

    x <- model.matrix(y ~ ., Train)    ## model matrix with intercept
    Model_El <- glmnet(x,y)
    Cv_El <- cv.glmnet(x,y)
    Test_Matrix <- model.matrix(y ~ ., Train)   ## prediction matrix with intercept
    Pred_El <- predict(Model_El, newx = Test_Matrix, s = Cv_El$lambda.min, type='response')
    

    Note, you don't have to do

    model.matrix(~ . -y)
    

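    model.matrix leaves the LHS of the formula (the response) out of the design matrix, so it is legitimate to use the form below. The response still has to be available when the matrix is built, because model.matrix constructs a model frame first; keep that in mind for test data that has no y column.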

    model.matrix(y ~ .)
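
To check the points above on the question's toy data, here is a minimal sketch. The seed and the helper names (mm_lhs, mm_nolhs, x_from_mm, same_design, same_predictors) are only illustrative and not part of the original code. The glmnet part runs only if the package is installed and relies on its default intercept = TRUE.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Same toy data as in the question
Train <- data.frame(x1 = runif(10), x2 = runif(10))
Train$y <- Train$x1 - Train$x2 + runif(10)

# Both formulas give the same design matrix: (Intercept), x1, x2
mm_lhs   <- model.matrix(y ~ ., data = Train)
mm_nolhs <- model.matrix(~ . - y, data = Train)
same_design <- isTRUE(all.equal(mm_lhs, mm_nolhs, check.attributes = FALSE))

# Dropping the "(Intercept)" column recovers the plain predictor matrix
x <- as.matrix(Train[, c("x1", "x2")])
x_from_mm <- mm_lhs[, -1]
same_predictors <- isTRUE(all.equal(x, x_from_mm, check.attributes = FALSE))

print(colnames(mm_lhs))
print(c(same_design = same_design, same_predictors = same_predictors))

# Optional: glmnet reports an "(Intercept)" coefficient even though
# x has no intercept column, and newx with the same columns as x works
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x, Train$y)      # intercept = TRUE by default
  cv  <- glmnet::cv.glmnet(x, Train$y)   # may warn: only 10 observations
  print(rownames(coef(fit, s = cv$lambda.min)))
  pred <- predict(fit, newx = x_from_mm, s = cv$lambda.min)
  print(dim(pred))
}
```

If both flags come out TRUE, the matrix from model.matrix(...)[,-1] is interchangeable with the x used for fitting, which is why removing the intercept column makes predict work.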