I used the caret and glmnet packages to run a lasso logistic regression, using repeated cross-validation to select the optimal lambda.
glmnet.obj <- train(outcome ~ .,
                    data = df.train,
                    method = "glmnet",
                    metric = "ROC",
                    family = "binomial",
                    trControl = trainControl(
                      method = "repeatedcv",
                      repeats = 10,
                      number = 10,
                      summaryFunction = twoClassSummary,
                      classProbs = TRUE,
                      savePredictions = "all",
                      selectionFunction = "best"))
After that, I get the best lambda and alpha:
best_lambda<- get_best_result(glmnet.obj)$lambda
best_alpha<- get_best_result(glmnet.obj)$alpha
Then I obtain the predicted probabilities for the test set:
pred_prob<- predict(glmnet.obj,s=best_lambda, alpha=best_alpha, type="prob", newx = x.test)
and then the predicted classes, which I intend to use in confusionMatrix:
pred_class<-predict(glmnet.obj,s=best_lambda, alpha=best_alpha, type="raw",newx=x.test)
But when I just run pred_class, it returns NULL.
What could I be missing here?
You need to use newdata = as opposed to newx =, because when you call predict(glmnet.obj), it dispatches to predict.train on the caret object.
You did not provide the get_best_result function, but I suppose it is from this source:
get_best_result = function(caret_fit) {
  best = which(rownames(caret_fit$results) == rownames(caret_fit$bestTune))
  best_result = caret_fit$results[best, ]
  rownames(best_result) = NULL
  best_result
}
Using an example dataset:
set.seed(111)
df = data.frame(outcome = factor(sample(c("y","n"),100,replace=TRUE)),
                matrix(rnorm(1000),ncol=10))
# rename the predictors before splitting, so both subsets share the same names
colnames(df)[-1] = paste0("col",1:10)
df.train = df[1:70,]
x.test = df[71:100,]
After fitting your model on df.train, you can predict on the test set with newdata:
pred_class<-predict(glmnet.obj,type="raw",newdata=x.test)
confusionMatrix(table(pred_class,x.test$outcome))
Confusion Matrix and Statistics

pred_class  n  y
         n  1  5
         y 11 13
The s = and newx = arguments come from glmnet. You can use them on glmnet.obj$finalModel, but you need to convert the data into a matrix first, for example:
predict(glmnet.obj$finalModel,s=best_lambda, alpha=best_alpha,
type="class",newx=as.matrix(x.test[,-1]))
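To check that the two routes give the same labels, here is a minimal sketch on simulated data. The names toy, fit, and x_new are made up for illustration, the resampling setup is deliberately smaller than the one in the question, and the tuned lambda is read from fit$bestTune rather than from get_best_result. It assumes the caret and glmnet packages are installed.

```r
library(caret)
library(glmnet)

set.seed(111)
# Simulated data, for illustration only (not the asker's data)
toy <- data.frame(outcome = factor(sample(c("y", "n"), 100, replace = TRUE)),
                  matrix(rnorm(1000), ncol = 10))
toy.train <- toy[1:70, ]
toy.test  <- toy[71:100, ]

# A lighter resampling scheme than in the question, to keep the run short
fit <- train(outcome ~ ., data = toy.train,
             method = "glmnet", family = "binomial",
             trControl = trainControl(method = "cv", number = 5))

# Route 1: caret's predict.train, which takes a data frame via newdata
pred_caret <- predict(fit, newdata = toy.test, type = "raw")

# Route 2: the underlying glmnet fit, which takes a numeric matrix via newx
# and the penalty via s; alpha was already fixed when finalModel was fitted
x_new <- as.matrix(toy.test[, -1])
pred_glmnet <- predict(fit$finalModel, newx = x_new,
                       s = fit$bestTune$lambda, type = "class")

# Compare the two sets of predicted labels
all(as.character(pred_caret) == as.vector(pred_glmnet))
```

In everyday use the newdata route is the simpler one, since caret handles the matrix conversion and supplies the tuned lambda itself.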