random-forest, xgboost, hyperparameters, mlr, adaboost

Hyperparameter tuning: which parameter space for ML algorithms (rf, adaboost, xgboost)?


I'm trying to tune the hyperparameters of several ML algorithms (rf, adaboost and xgboost) to train a model with a multiclass target variable. I'm working with the mlr package in R. However, I'm not sure about the following:

  • which hyperparameters to tune (and which to leave at their defaults)
  • what the search space should be for the hyperparameters that are tuned

Do you know any sources where I can find information about this?

For example:

filterParams(getParamSet("classif.randomForest"), tunable = TRUE)

Gives

                    Type  len   Def   Constr Req Tunable Trafo
ntree            integer    -   500 1 to Inf   -    TRUE     -
mtry             integer    -     - 1 to Inf   -    TRUE     -
replace          logical    -  TRUE        -   -    TRUE     -
classwt    numericvector <NA>     - 0 to Inf   -    TRUE     -
cutoff     numericvector <NA>     -   0 to 1   -    TRUE     -
sampsize   integervector <NA>     - 1 to Inf   -    TRUE     -
nodesize         integer    -     1 1 to Inf   -    TRUE     -
maxnodes         integer    -     - 1 to Inf   -    TRUE     -
importance       logical    - FALSE        -   -    TRUE     -
localImp         logical    - FALSE        -   -    TRUE     -

Space: lower, upper, transformation

# Search mtry as a fraction of the feature count (at least 1)
params_to_tune <- makeParamSet(
  makeNumericParam("mtry", lower = 0, upper = 1,
                   trafo = function(x) max(1, ceiling(x * ncol(train_x))))
)
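
To make the effect of such a transformation concrete, here is a minimal base-R sketch of the mapping from the [0, 1] search scale to an mtry value. The feature count of 10 and the sampled values are hypothetical stand-ins for ncol(train_x) and for whatever the tuner would draw. The max(1, ...) guard keeps mtry at 1 or above when the sampled value is 0.

```r
# Hypothetical feature count; in the question this would be ncol(train_x).
n_features <- 10

# Map a fraction in [0, 1] to a number of candidate features per split.
# The guard keeps the result at 1 or above, since mtry must be >= 1.
to_mtry <- function(x) max(1, ceiling(x * n_features))

# Illustrative values a tuner might sample on the [0, 1] scale.
sampled <- c(0, 0.05, 0.33, 0.5, 1)
mtry_values <- vapply(sampled, to_mtry, numeric(1))

print(mtry_values)  # 1 1 4 5 10
```

Searching on a fraction rather than on raw counts keeps the same parameter set usable if the number of features changes.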

Solution

  • In general, you want to tune every parameter that is marked tunable, over value ranges as wide as your compute budget allows. In practice, some of these will make no difference to performance, but you usually cannot know which ones beforehand.
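
As a rough illustration of this advice, the sketch below runs a random search over three of the random forest parameters from the table in the question. It assumes the mlr and randomForest packages are installed. The built-in iris.task stands in for the real multiclass task, and the ranges, the 3-fold cross-validation and the budget of 10 iterations are illustrative choices, not recommendations.

```r
library(mlr)

set.seed(1)  # make the random search reproducible

# Search space: three of the tunable parameters, with illustrative ranges.
# iris has 4 features, so mtry can be at most 4 here.
ps <- makeParamSet(
  makeIntegerParam("ntree", lower = 50, upper = 500),
  makeIntegerParam("mtry", lower = 1, upper = 4),
  makeIntegerParam("nodesize", lower = 1, upper = 20)
)

# Random search with a small budget; raise maxit as far as you can afford.
ctrl <- makeTuneControlRandom(maxit = 10L)

# 3-fold cross-validation to estimate performance of each configuration.
rdesc <- makeResampleDesc("CV", iters = 3L)

res <- tuneParams(
  learner = "classif.randomForest",
  task = iris.task,       # stand-in for your own multiclass task
  resampling = rdesc,
  measures = acc,
  par.set = ps,
  control = ctrl,
  show.info = FALSE
)

print(res$x)  # best hyperparameter values found
print(res$y)  # cross-validated accuracy for those values
```

Random search is used here because it copes with a wide space on a fixed budget. Parameters that turn out not to matter cost little, since every iteration still tries a new value for the ones that do.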