
Elastic Net Regression (Prostate Data)


In the original elastic net paper, Zou and Hastie (2005) used the prostate cancer data to compare methods. I would like to reproduce their results with the glmnet package in R. As in the paper, the response is lpsa, and the training and test sets are defined by the variable train in the data. I set alpha = 0.26 (as in the paper) and used cross-validation to estimate lambda. However, the test mean squared error I get does not match the one reported in the paper (0.381). Where is my mistake?

The code I used is given below.

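library(ElemStatLearn)  # provides the prostate data set
library(glmnet)

# Predictor matrix without the intercept column and the train indicator
x = model.matrix(lpsa ~ . - train, data = prostate)[, -1]
y = prostate$lpsa

# Row indices of the training and test sets (train is a logical column)
trainlab = which(prostate$train)
testlab = which(!prostate$train)
y.test = y[testlab]

alph = 0.26
en.mod = glmnet(x[trainlab, ], y[trainlab], alpha = alph)

# Choose lambda by cross-validation on the training set
set.seed(1)
cv.out = cv.glmnet(x[trainlab, ], y[trainlab], alpha = alph)
bestlambda = cv.out$lambda.min

# Test-set mean squared error at the selected lambda
en.pred = predict(en.mod, s = bestlambda, newx = x[testlab, ])
MSE.en = mean((en.pred - y.test)^2)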
MSE.en
[1] 0.5043356

Solution

  • The paper fits the model with an algorithm called LARS-EN rather than the coordinate descent used by glmnet, so the two are not guaranteed to give the same numbers. The elasticnet package implements LARS-EN, so it is worth trying that package instead.
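
A minimal sketch of that suggestion, assuming the elasticnet package is installed. The helper name enet_test_mse is hypothetical and is not part of either package. Note that enet is tuned by two separate values, a quadratic penalty lambda and an L1 tuning value s, whereas alpha in glmnet is a mixing weight between the two penalties. If the 0.26 in the paper is the L1-norm fraction s rather than a mixing weight (worth checking against the paper's results table), then passing it to glmnet as alpha would not reproduce the paper's model.

```r
library(elasticnet)

# Hypothetical helper: fit LARS-EN on the training data and return the
# mean squared error on the test data.
#   lambda: quadratic (ridge-type) penalty passed to enet()
#   s:      fraction of the L1 norm, between 0 and 1 (mode = "fraction")
enet_test_mse <- function(x_train, y_train, x_test, y_test, lambda, s) {
  # enet() standardizes the predictors by default (normalize = TRUE)
  fit <- enet(x_train, y_train, lambda = lambda)

  # predict() on an enet object returns a list; the predictions are in $fit
  pred <- predict(fit, newx = x_test, s = s,
                  type = "fit", mode = "fraction")$fit

  mean((pred - y_test)^2)
}
```

With the objects from the question, the call would look like enet_test_mse(x[trainlab, ], y[trainlab], x[testlab, ], y.test, lambda = lam, s = 0.26), where lam is a placeholder for the quadratic penalty reported in the paper. I have not verified which value reproduces 0.381, so take lam from the paper rather than from this sketch. cv.enet in the same package can be used to tune s for a given lambda.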