
MLR3: ROC curves and extraction of standard deviation / 95% CI?


I want to extract the standard deviation and/or a 95% CI of the results obtained in a benchmark of multiple learners on a task, so that the reported results are more complete. I read this: mlr3 standard deviation for k-fold cross-validation resampling

But I don't know whether anything new has been added since February...

Here is my code and the plots of the bmr :

resampling_outer = rsmp("cv", folds = 5)
resampling_inner = rsmp("cv", folds = 3)

set.seed(372)
resampling_outer$instantiate(task_wilcox)
resampling_inner$instantiate(task_wilcox)

at_xgboost = auto_tuner(tuner = tnr("mbo"), learner = xgboost, resampling = resampling_inner,
                        measure = msr("classif.auc"), term_evals = 20,
                        store_tuning_instance = TRUE, store_models = TRUE)
at_ranger = auto_tuner(tuner = tnr("mbo"), learner = ranger, resampling = resampling_inner,
                       measure = msr("classif.auc"), term_evals = 20,
                       store_tuning_instance = TRUE, store_models = TRUE)
at_svm = auto_tuner(tuner = tnr("mbo"), learner = svm, resampling = resampling_inner,
                    measure = msr("classif.auc"), term_evals = 20,
                    store_tuning_instance = TRUE, store_models = TRUE)
at_knn = auto_tuner(tuner = tnr("mbo"), learner = knn, resampling = resampling_inner,
                    measure = msr("classif.auc"), term_evals = 20,
                    store_tuning_instance = TRUE, store_models = TRUE)


learners <- c(at_xgboost, at_svm, at_ranger, at_knn)

measures = msrs(c("classif.auc", "classif.bacc", "classif.bbrier"))

#Benchmarking
set.seed(372)
design = benchmark_grid(tasks = task_wilcox, learners = learners, resamplings = resampling_outer)
bmr = benchmark(design, store_models = TRUE)
results <- bmr$aggregate(measures)
print(results)
autoplot(bmr, measure = msr("classif.auc"))
autoplot(bmr, type = "roc")

[Plot 1: box plots of classif.auc for each learner]

[Plot 2: ROC curves for each learner, with a translucent band around each curve]

results
   nr     task_id                learner_id resampling_id iters classif.auc classif.bacc classif.bbrier
1:  1 data_wilcox       scale.xgboost.tuned            cv     5   0.6112939    0.5767294      0.2326787
2:  2 data_wilcox           scale.svm.tuned            cv     5   0.5226407    0.5010260      0.1893202
3:  3 data_wilcox scale.random_forest.tuned            cv     5   0.6200084    0.5614843      0.2229120
4:  4 data_wilcox           scale.knn.tuned            cv     5   0.5731675    0.5002955      0.1917721
extract_inner_tuning_results(bmr)[,list(learner_id, classif.auc)]
                   learner_id classif.auc
 1:       scale.xgboost.tuned   0.6231350
 2:       scale.xgboost.tuned   0.6207103
 3:       scale.xgboost.tuned   0.6175323
 4:       scale.xgboost.tuned   0.6195693
 5:       scale.xgboost.tuned   0.6222398
 6:           scale.svm.tuned   0.5891432
 7:           scale.svm.tuned   0.5837583
 8:           scale.svm.tuned   0.5767444
 9:           scale.svm.tuned   0.6027165
10:           scale.svm.tuned   0.6082825
11: scale.random_forest.tuned   0.6287649
12: scale.random_forest.tuned   0.6165179
13: scale.random_forest.tuned   0.6288599
14: scale.random_forest.tuned   0.6259322
15: scale.random_forest.tuned   0.6234295
16:           scale.knn.tuned   0.5931790
17:           scale.knn.tuned   0.5926835
18:           scale.knn.tuned   0.5931790
19:           scale.knn.tuned   0.5929156
20:           scale.knn.tuned   0.5929156

As you can see on the ROC curve, there are standard deviations or CIs shown by the colored, translucent band around each curve, but how can I extract them? I suppose that for the standard deviation I have to extract the results on all the outer resampling folds of my nested CV, but there seems to be no way to extract it directly (it appears on the ROC curve via the band, so I suppose it exists somewhere...).
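Here is roughly what I have in mind for the standard deviation (a minimal sketch, using the bmr object from above; not sure this is the intended way):

scores = bmr$score(msr("classif.auc"))   # one row per outer fold and learner
# mean and standard deviation of the per-fold AUC, per learner
scores[, .(mean_auc = mean(classif.auc), sd_auc = sd(classif.auc)), by = learner_id]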

Second question: on the plot with the box plots of the AUC of each learner, I don't really know how the box plots are built, since they don't seem to correspond to the results on the test sets (outer loop => resampling_outer)...

Last question: do you know how to customize the ROC curves in mlr3? For example, to add the AUC to the plot, or to remove the band around each curve...

Thanks !


Solution

    1. You can get the standard deviations together with the predictions by setting predict_type = "se", see https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-predicting. This is computed by the learner internally and not over the folds of a cross-validation, but it is most likely what you want.
    2. The numbers you get as results should correspond to the plots; without a complete reproducible example I can't tell you more.
    3. All plotting functions in mlr3 return ggplot2 objects, so you can customize them in the same way as any other ggplot2 plot; see the sketch below.
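For point 3, a minimal sketch, assuming the bmr object from your question and that mlr3viz is loaded:

library(ggplot2)

# autoplot() returns a regular ggplot object, so ggplot2 layers can be added on top
p = autoplot(bmr, type = "roc")

# e.g. put the aggregated outer-CV AUC of each learner into the title and change the theme
aucs = bmr$aggregate(msr("classif.auc"))
p + ggtitle(paste(sprintf("%s: AUC = %.3f", aucs$learner_id, aucs$classif.auc), collapse = " | ")) +
  theme_minimal()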