Extraction of tuned hyperparameters from tuning instance archive


I've built an automated machine learning system based on the following example:

https://mlr-org.com/gallery/2021-03-11-practical-tuning-series-build-an-automated-machine-learning-system/

I used the xgboost and Random Forest learners and made use of branching. xgboost gave me the best results during the training phase, so I extracted the optimized hyperparameters and built the final xgboost model:

lrn = as_learner(graph)
lrn$param_set$values = instance$result_learner_param_vals

I'm also interested in the param_vals of the best-performing Random Forest model.

I thought I could get the hyperparameters and save the best Random Forest model like this:

Arch = as.data.table(instance$archive, exclude_columns = NULL) # so we keep the uhash
best_RF = Arch[branch.selection == "lrn_ranger"]
best_RF = best_RF[which.min(best_RF$regr.rmse), ] # to get the best RF model from the data table
instance$archive$learner_param_vals(uhash = best_RF$uhash)

lrn_2 = as_learner(graph)
lrn_2$param_set$values = instance$archive$learner_param_vals(uhash = best_RF$uhash)
#lrn_2$param_set$values = instance$archive$learner_param_vals(i = best_RF$batch_nr)

Whether I use the uhash or the batch_nr, I can't retrieve the hyperparameters of the best Random Forest model. I always receive the param_set of the first row in the archive, even though the uhash and batch_nr are correct:

$slct_1.selector
selector_name(c("T", "RH"))

$missind.which
[1] "missing_train"

$missind.type
[1] "factor"

$missind.affect_columns
selector_invert(selector_type(c("factor", "ordered", "character")))

$encode.method
[1] "treatment"

$encode.affect_columns
selector_type("factor")

$branch.selection
[1] "lrn_ranger"

$slct_2.selector
selector_name(c("T"))

$mutate.mutation
$mutate.mutation$Tra.Trafo
~(T^(2))/(2)


$mutate.delete_originals
[1] FALSE

$xgboost.nthread
[1] 1

$xgboost.verbose
[1] 0

$ranger.min.node.size
[1] 15

$ranger.mtry
[1] 1

$ranger.num.threads
[1] 1

$ranger.num.trees
[1] 26

$ranger.sample.fraction
[1] 0.8735846

Can somebody give me a hint on how to extract other hyperparameter configurations when I'm interested in more than the output of instance$result_learner_param_vals?

Edit:

I want to clarify something that is also related to branching. After reading the comment from @be_marc, I'm not sure whether it is intended to work like this. Let's use the gallery example I posted as a reference. I want to compare the results of different tuned branches using a GraphLearner object. I created the final model as in the gallery example, which in my case is an xgboost model. I also want to create the final models for the other branches for benchmarking purposes. The issue is that if I don't create a deep clone of the original graph_learner, the branch.selection parameter of the original graph_learner gets changed. Why can't I just use a normal clone? Why must it be a deep clone? Is it supposed to work like that? Most likely I just don't understand the difference between a clone and a deep clone.

# Reference for cloning https://mlr3.mlr-org.com/reference/Resampling.html
# equivalent to object called graph_learner in mlr3 gallery example 
graph_learner$param_set$values$branch.selection # original graph_learner object (reference MLR_gallery in first post)

# individually uncomment for different cases
# --------------------------------------------------------------------------------
#MLR_graph = graph # object graph_learner does not keep its original state
#MLR_graph = graph$clone() # object graph_learner does not keep its original state
MLR_graph = graph$clone(deep = TRUE) # object graph_learner keeps its original state
# --------------------------------------------------------------------------------
MLR_graph$param_set$values$branch.selection # value inherited from original graph
MLR_graph$param_set$values$branch.selection = "lrn_MLR" # change set value to other branch
MLR_graph$param_set$values$branch.selection # changed to branch "lrn_MLR"
MLR_lrn = as_learner(MLR_graph) # create a learner from graph with new set branch

# Here we can see the different behaviours depending on whether we don't clone, clone or deep clone
# at the end, the original graph_learner is supposed to keep its original state
graph_learner$param_set$values$branch.selection
MLR_lrn$param_set$values$branch.selection

When I don't use a deep clone, the overall best model lrn (see the beginning of this post) is affected too. In my case, it was xgboost. The parameter branch.selection of lrn gets set to lrn_MLR:

print(lrn)

<GraphLearner:slct_1.copy.missind.imputer_num.encode.featureunion.branch.nop_1.nop_2.slct_2.nop_3.nop_4.mutate.xgboost.ranger.MLR.unbranch>
* Model: list
* Parameters: slct_1.selector=<Selector>, missind.which=missing_train, missind.type=factor,
  missind.affect_columns=<Selector>, encode.method=treatment, encode.affect_columns=<Selector>,
  branch.selection=lrn_MLR, slct_2.selector=<Selector>, mutate.mutation=<list>, mutate.delete_originals=FALSE,
  xgboost.alpha=1.891, xgboost.eta=0.06144, xgboost.lambda=0.01341, xgboost.max_depth=3, xgboost.nrounds=122,
  xgboost.nthread=1, xgboost.verbose=0, ranger.num.threads=1
* Packages: mlr3, mlr3pipelines, stats, mlr3learners, xgboost, ranger
* Predict Types:  [response], se, distr
* Feature Types: logical, integer, numeric, character, factor, ordered, POSIXct
* Properties: featureless, hotstart_backward, hotstart_forward, importance, loglik, missings, oob_error,
  selected_features, weights

Edit 2: Okay, I just found out that I should always use deep clones when I work with different, distinct learners in an experiment: https://github.com/mlr-org/mlr3/issues/344

The behaviour is intended.
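
For anyone else wondering about the difference between a clone and a deep clone: mlr3 objects are R6 objects, and in R6 a plain clone() copies the outer object but keeps sharing any R6 objects stored in its fields. Here is a minimal sketch with plain R6. The classes Params and Holder are hypothetical stand-ins made up for illustration, not mlr3 classes, and the sketch makes no claim about how a Graph stores its parameters internally.

```r
library(R6)

# Hypothetical stand-in for an object that stores parameter values.
Params = R6Class("Params",
  public = list(values = list(branch = "xgboost"))
)

# Hypothetical stand-in for an object that holds another R6 object in a field.
Holder = R6Class("Holder",
  public = list(
    params = NULL,
    initialize = function() {
      self$params = Params$new()
    }
  )
)

original = Holder$new()
shallow = original$clone()          # new Holder, but the same Params object
deep = original$clone(deep = TRUE)  # new Holder with its own copy of Params

shallow$params$values$branch = "ranger"  # also visible through `original`
deep$params$values$branch = "lm"         # only affects `deep`

original$params$values$branch
shallow$params$values$branch
deep$params$values$branch
```

After running this, original and shallow both report "ranger" because they point at the same Params object, while deep holds "lm" independently.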


Solution

  • We fixed the bug in the latest dev version (10.09.2022). You can install it with

    remotes::install_github("mlr-org/mlr3")
    

    The learners were not reassembled properly. This works again:

    library(mlr3pipelines)
    library(mlr3tuning)
    
    learner = po("subsample") %>>% lrn("classif.rpart", cp = to_tune(0.1, 1))
    
    # hyperparameter tuning on the pima indians diabetes data set
    instance = tune(
      method = "random_search",
      task = tsk("pima"),
      learner = learner,
      resampling = rsmp("cv", folds = 3),
      measure = msr("classif.ce"),
      term_evals = 10
    )
    
    instance$archive$learner_param_vals(i = 1)
    instance$archive$learner_param_vals(i = 2)
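
With the fix in place, the per-branch lookup from the question can be done for all branches at once. The following is a minimal sketch of the selection step only, using data.table. The table arch is a toy stand-in for as.data.table(instance$archive, exclude_columns = NULL), and its values are invented for illustration. The column names branch.selection, regr.rmse, uhash and batch_nr are taken from the question.

```r
library(data.table)

# Toy stand-in for the archive table; all values are invented.
arch = data.table(
  branch.selection = c("lrn_xgboost", "lrn_ranger", "lrn_ranger", "lrn_xgboost"),
  regr.rmse = c(0.8, 1.2, 1.0, 0.9),
  uhash = c("u1", "u2", "u3", "u4"),
  batch_nr = 1:4
)

# Keep one row per branch: the evaluation with the lowest RMSE.
best_per_branch = arch[, .SD[which.min(regr.rmse)], by = branch.selection]
print(best_per_branch)
```

Each uhash in best_per_branch can then be passed to instance$archive$learner_param_vals(uhash = ...), as in the question, to get the parameter values for that branch's best configuration.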