I am conducting a benchmark analysis using the mlr package and would like to use auc as my performance measure.
I have specified predict.type = "prob" but am still getting the following error message:
0001: Error in FUN(X[[i]], ...) :
Measure auc requires predict type to be: 'prob'!
My code:
#define measures
meas <- list(acc, mlr::auc, brier)
##random forest
p_length <- ncol(training_complete) - 1
lrn_RF = makeLearner("classif.randomForest", predict.type = "prob", par.vals = list("ntree" = 500L))
wcw_lrn_RF = makeWeightedClassesWrapper(lrn_RF, wcw.weight = 0.10) #weighted class wrapper
parsRF = makeParamSet(
  makeIntegerParam("mtry", lower = 1, upper = floor(0.4 * p_length)),
  makeIntegerParam("nodesize", lower = 10, upper = 50))
tuneRF = makeTuneControlRandom(maxit = 100)
inner = makeResampleDesc("CV", iters = 2)
learnerRF = makeTuneWrapper(lrn_RF, resampling = inner, meas, par.set = parsRF, control = tuneRF, show.info = FALSE)
##extreme gradient boosting
lrn_xgboost <- makeLearner(
  "classif.xgboost",
  predict.type = "prob", #before was response
  par.vals = list(objective = "binary:logistic", eval_metric = "error", nrounds = 200))
getParamSet("classif.xgboost")
pars_xgboost = makeParamSet(
  makeIntegerParam("nrounds", lower = 100, upper = 500),
  makeIntegerParam("max_depth", lower = 1, upper = 10),
  makeNumericParam("eta", lower = .1, upper = .5),
  makeNumericParam("lambda", lower = -1, upper = 0, trafo = function(x) 10^x))
tunexgboost = makeTuneControlRandom(maxit = 50)
inner = makeResampleDesc("CV", iters = 2)
learnerxgboost = makeTuneWrapper(lrn_xgboost, resampling = inner, meas, par.set = pars_xgboost, control = tunexgboost, show.info = FALSE)
##Benchmarking via outer resampling loop
#Learners to be compared
lrns = list(
  makeLearner("classif.featureless"),
  learnerRF,
  learnerxgboost
)
#outer resampling strategy
rdesc = makeResampleDesc("CV", iters = 5)
library(methods)
library(parallel)
library(parallelMap)
set.seed(123, "L'Ecuyer")
parallelStartSocket(parallel::detectCores(), level = "mlr.resample")
churn_benchmarking <- benchmark(learners = lrns,
                                tasks = trainTask,
                                resamplings = rdesc,
                                models = FALSE,
                                measures = meas)
parallelStop()
Any hint is highly appreciated!
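I can see one problem: your featureless learner is not providing probabilities. makeLearner() defaults to predict.type = "response", and benchmark() evaluates every measure in meas on every learner in lrns, so auc fails on the featureless learner even though the other two learners are set to "prob".
Write makeLearner("classif.featureless", predict.type = "prob")
instead.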
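As a quick sanity check before calling benchmark(), you can inspect the predict type of every learner in the list. The following is only a minimal sketch: classif.rpart is a hypothetical stand-in for the tuned learners in the question (it assumes the rpart package is installed), and predict_types is a name introduced here for illustration.

```r
library(mlr)

# Hypothetical stand-ins for the learners in the question;
# replace them with learnerRF and learnerxgboost in the real script.
lrns <- list(
  makeLearner("classif.featureless", predict.type = "prob"),
  makeLearner("classif.rpart", predict.type = "prob")
)

# Collect the predict type of each learner, named by learner id.
predict_types <- vapply(lrns, function(l) l$predict.type, character(1))
names(predict_types) <- vapply(lrns, function(l) l$id, character(1))
print(predict_types)

# auc needs probabilities, so stop early if any learner predicts "response".
stopifnot(all(predict_types == "prob"))
```

If the stopifnot() call fails, the printed vector shows which learner still has to be created with predict.type = "prob".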