How to set the graph learner id in mlr3pipelines?


I set up a benchmark with 4 graph learners on 1 dataset. The learner_id in the benchmark result is very long because the graph includes several preprocessing steps. How can I set the learner id so that it isn't so long? Here's my code:

# step 1 the task
all_plays <- readRDS("../000files/all_plays.rds")
pbp_task <- as_task_classif(all_plays, target="play_type")
split_task <- partition(pbp_task, ratio=0.75)
task_train <- pbp_task$clone()$filter(split_task$train)
task_test <- pbp_task$clone()$filter(split_task$test)

# step 2 the preprocess
pbp_prep <- po("select", 
               selector = selector_invert(
                 selector_name(c("half_seconds_remaining","yards_gained","game_id")))
               ) %>>%
  po("colapply", 
     affect_columns = selector_name(c("posteam","defteam")),
     applicator = as.factor) %>>% 
  po("filter", 
     filter = mlr3filters::flt("find_correlation"), filter.cutoff=0.3) %>>%
  po("scale", scale = FALSE) %>>%
  po("removeconstants")

# step 3 learners
rf_glr <- as_learner(pbp_prep %>>% lrn("classif.ranger", predict_type="prob")) 
log_glr <-as_learner(pbp_prep %>>% lrn("classif.log_reg", predict_type="prob")) 
tree_glr <- as_learner(pbp_prep %>>% lrn("classif.rpart", predict_type="prob")) 
kknn_glr <- as_learner(pbp_prep %>>% lrn("classif.kknn", predict_type="prob")) 

# step 4 benchmark grid
set.seed(0520)
cv <- rsmp("cv",folds=10)
design <- benchmark_grid(
  tasks = task_train,
  learners = list(rf_glr,log_glr,tree_glr,kknn_glr),
  resamplings = cv
)

# step 5 benchmark
bmr <- benchmark(design, store_models = TRUE)
bmr

# learner_id toooo long...
<BenchmarkResult> of 40 rows with 4 resampling runs
 nr   task_id                                                          learner_id resampling_id
  1 all_plays select.colapply.find_correlation.scale.removeconstants.randomForest            cv
  2 all_plays     select.colapply.find_correlation.scale.removeconstants.logistic            cv
  3 all_plays select.colapply.find_correlation.scale.removeconstants.decisionTree            cv
  4 all_plays         select.colapply.find_correlation.scale.removeconstants.kknn            cv
 iters warnings errors
    10        0      0
    10        0      0
    10        0      0
    10        0      0

The learner_id is too long in this result, and it also looks bad in autoplot(bmr). How can I set a shorter learner_id? Thank you very much.


Solution

  • Assign a short name to the graph learner's id field after creating it:

    library(mlr3verse)
    #> Loading required package: mlr3
    learner = as_learner(po("pca") %>>% po("learner", lrn("regr.rpart")))
    learner$id = "my_id"
    print(learner)
    #> <GraphLearner:my_id>
    #> * Model: -
    #> * Parameters: regr.rpart.xval=0
    #> * Packages: mlr3, mlr3pipelines, rpart
    #> * Predict Types:  [response], se, distr
    #> * Feature Types: logical, integer, numeric, character, factor, ordered,
    #>   POSIXct
    #> * Properties: featureless, hotstart_backward, hotstart_forward,
    #>   importance, loglik, missings, oob_error, selected_features, weights
    

    Created on 2022-07-22 by the reprex package (v2.0.1)
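
Applied to a benchmark like the one in the question, the same assignment could look roughly like this. It is only a sketch under a few assumptions: the built-in sonar task stands in for all_plays, the preprocessing graph is cut down to two steps, classif.rpart and classif.featureless are used so that no extra learner packages are needed, and the ids "tree" and "baseline" are arbitrary names chosen for illustration.

```r
library(mlr3)
library(mlr3pipelines)

# Stand-in task (assumption): the question's all_plays data is not available here
task <- tsk("sonar")

# Shortened stand-in for the question's preprocessing graph
prep <- po("scale", scale = FALSE) %>>% po("removeconstants")

# Same pattern as in the question: preprocessing graph followed by a learner
tree_glr <- as_learner(prep %>>% lrn("classif.rpart", predict_type = "prob"))
base_glr <- as_learner(prep %>>% lrn("classif.featureless", predict_type = "prob"))

# Overwrite the long auto-generated ids before building the design
tree_glr$id <- "tree"
base_glr$id <- "baseline"

set.seed(520)
design <- benchmark_grid(
  tasks = task,
  learners = list(tree_glr, base_glr),
  resamplings = rsmp("cv", folds = 3)
)
bmr <- benchmark(design)

# The learner_id column now carries the short names
scores <- bmr$aggregate(msr("classif.ce"))
print(scores[, c("learner_id", "classif.ce")])
```

The ids have to be set before benchmark_grid() is called, since the benchmark result records whatever id each learner has at that point.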