I'm trying to use the final model extracted from a trained gbm model, but the extracted model does not return class labels (a factor) the way the trained model does. Judging by the returned values, the extracted final model works, but it only returns raw numeric values. How can I get factor predictions like those from the trained model?
library(caret)
library(mlbench)
data(Sonar)
set.seed(7)
Sonar$Class <- ifelse(Sonar$Class == 'R', 0, 1)
Sonar$Class <- as.factor(Sonar$Class)
validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]
outcomename <- 'Class'
predictors <- names(training)[!names(training) %in% outcomename]
set.seed(7)
control <- trainControl(method = "repeatedcv", number = 5, repeats = 5)
model_gbm <- train(training[, predictors], training[, outcomename], method = 'gbm', trControl = control, tuneLength = 10)
predict(model_gbm, validation[,1:60])
[1] 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1
predict(model_gbm$finalModel, validation[,1:60], n.trees = 300)
[1] -3.1174531 -1.8335718 5.0780422 -8.6681791 8.9634393 -1.4079936 11.7232458
[8] 18.4189859 14.3978772 11.3605253 13.4694812 10.2752696 11.4957672 10.0370462
[15] 8.6009983 0.3718381 0.1297673 2.4099186 6.7774090 -10.8356795 -10.1842065
[22] -2.3222431 -8.1525336 -3.3665867 -10.7953353 -2.4607156 -11.4277641 -4.7164270
[29] -6.3882544 -3.7306579 -6.9323133 -4.2643347 -0.2128462 -9.3395850 -13.0759289
[36] -12.8259643 -6.5314340 -12.7968160 -16.6217507 -12.0370978 -3.1100361
The predict.gbm function has a type argument, which can be "response" or "link" (the default). To get predicted probabilities, set it to "response". To convert these probabilities to a class, apply a threshold (caret's train uses 0.5). Here is an example:
library(caret)
library(mlbench)
data(Sonar)
set.seed(7)
validation_index <- createDataPartition(Sonar$Class, p=0.80, list=FALSE)
validation <- Sonar[-validation_index,]
training <- Sonar[validation_index,]
set.seed(7)
control <- trainControl(method = "repeatedcv",
                        number = 2,
                        repeats = 2)
model_gbm <- train(Class ~ .,
                   data = training,
                   method = 'gbm',
                   trControl = control,
                   tuneLength = 3)
Predict probabilities using caret:
preds1 <- predict(model_gbm, validation[,1:60], type = "prob")
Predict probabilities using gbm directly:
library(gbm)
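# use the number of trees selected by caret (100 in this run) rather than hard-coding it
preds2 <- predict(model_gbm$finalModel, validation[,1:60], n.trees = model_gbm$bestTune$n.trees, type = "response")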
all.equal(preds1[,1], preds2)
#output
TRUE
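For context on the numbers in the question: with the default type = "link", predict.gbm returns values on the log-odds scale, and type = "response" applies the inverse logit to them. This assumes the model was fitted with the bernoulli distribution, which is what caret appears to use for a two-class outcome. A minimal base R sketch, using the first five values copied from the question's output (the variable names are only illustrative):

```r
# First five link-scale (log-odds) values copied from the question's output
link <- c(-3.1174531, -1.8335718, 5.0780422, -8.6681791, 8.9634393)

# Inverse logit: the transformation behind type = "response"
# (assumption: bernoulli distribution)
prob <- plogis(link)

# The same transformation written out explicitly
prob_manual <- 1 / (1 + exp(-link))

# Both versions agree, and every value now lies between 0 and 1
all.equal(prob, prob_manual)
range(prob)
```

A positive link value therefore corresponds to a probability above 0.5 and a negative one to a probability below 0.5.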
To compare predicted classes instead of probabilities, predict with caret:
preds1_class <- predict(model_gbm, validation[,1:60])
To check that these match the gbm predictions, threshold the gbm probabilities at 0.5:
all.equal(
  as.factor(ifelse(preds2 > 0.5, "M", "R")),
  preds1_class)
#output
TRUE
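For the model in the question, where Class was recoded to the levels 0 and 1, the same thresholding applies with those labels. The helper below is a hypothetical sketch (prob_to_class is not part of caret or gbm). It assumes, as the check above suggests, that the probability returned by gbm refers to the first factor level. The input values are the first five link values from the question, converted to probabilities with plogis, again assuming a bernoulli model:

```r
# Hypothetical helper: turn probabilities of the first level into a factor.
# Assumption: `prob` is P(first level), which is how caret appears to code
# a two-class outcome for gbm.
prob_to_class <- function(prob, lev, threshold = 0.5) {
  factor(ifelse(prob >= threshold, lev[1], lev[2]), levels = lev)
}

# First five link values from the question, converted to probabilities
prob <- plogis(c(-3.1174531, -1.8335718, 5.0780422, -8.6681791, 8.9634393))

# Levels as recoded in the question: "0" (was R) comes first, then "1" (was M)
pred_class <- prob_to_class(prob, lev = c("0", "1"))
pred_class
```

This gives 1 1 0 1 0 with levels 0 and 1, the same as the first five caret predictions shown in the question. With the real model, one would pass the output of predict(model_gbm$finalModel, ..., type = "response") as prob and levels(training$Class) as lev.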