r h2o mlr

Specify `makeNumericVectorParam` for the `hidden_dropout_ratios` hyperparameter so that its length depends on the number of hidden layers


I would like to tune the "classif.h2o.deeplearning" learner via mlr. The tuning should explore several architectures, and for each architecture I would like to specify its own dropout search space. However, I am struggling to set this up.

Example:

library(mlr)
library(h2o)

ctrl <- makeTuneControlRandom(maxit = 10) 

lrn <- makeLearner("classif.h2o.deeplearning", predict.type = "prob")

I define two architectures, "a" and "b", via the "hidden" DiscreteParam. For each of them I would like to create a NumericVectorParam for "hidden_dropout_ratios" whose length matches the number of hidden layers:

par_set <- makeParamSet(
  makeDiscreteParam("hidden", values = list(a = c(16L, 16L),
                                            b = c(16L, 16L, 16L))),
  makeDiscreteParam("activation", values = "RectifierWithDropout", tunable = FALSE),
  makeNumericParam("input_dropout_ratio", lower = 0, upper = 0.4, default = 0.1),
  makeNumericVectorParam("hidden_dropout_ratios", len = 2, lower = 0, upper = 0.6, default = rep(0.3, 2),
                         requires = quote(length(hidden) == 2)),
  makeNumericVectorParam("hidden_dropout_ratios", len = 3, lower = 0, upper = 0.6, default = rep(0.3, 3),
                         requires = quote(length(hidden) == 3)))

This produces an error:

Error in makeParamSet(makeDiscreteParam("hidden", values = list(a = c(16L,  : 
All parameters must have unique names!

Defining just one of them means that dropout is applied only to architectures with the matching number of hidden layers.

When I attempt to use the same dropout for all hidden layers:

par_set <- makeParamSet(
  makeDiscreteParam("hidden", values = list(a = c(16L, 16L),
                                            b = c(16L, 16L, 16L))),
  makeDiscreteParam("activation", values = "RectifierWithDropout", tunable = FALSE),
  makeNumericParam("input_dropout_ratio", lower = 0, upper = 0.4, default = 0.1),
  makeNumericParam("hidden_dropout_ratios", lower = 0, upper = 0.6, default = 0.3))

tw <- makeTuneWrapper(lrn,
                      resampling = cv3,
                      control = ctrl,
                      par.set = par_set,
                      show.info = TRUE,
                      measures = list(auc,
                                      bac))

perf_tw <- resample(tw, 
                     task = sonar.task,
                     resampling = cv5,
                     extract = getTuneResult,
                     models = TRUE,
                     show.info = TRUE,
                     measures = list(auc,
                                     bac))

I get the error:

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  : 
ERROR MESSAGE:

Illegal argument(s) for DeepLearning model: DeepLearning_model_R_1566289564965_2.  Details: ERRR on field: _hidden_dropout_ratios: Must have 3 hidden layer dropout ratios.

Perhaps I could overcome this by creating a separate learner for each architecture and then combining with makeModelMultiplexer?

I would like your help in overcoming this. Thanks.

EDIT: I was able to overcome this using makeModelMultiplexer and by creating a learner for each architecture (number of hidden layers).

base_lrn <- list(
  makeLearner("classif.h2o.deeplearning",
              id = "h20_2",
              predict.type = "prob"),
  makeLearner("classif.h2o.deeplearning",
              id = "h20_3",
              predict.type = "prob"))

mm_lrn <- makeModelMultiplexer(base_lrn)

par_set <- makeParamSet(
  # extractSubList() comes from BBmisc, which library(mlr) does not attach
  makeDiscreteParam("selected.learner", values = BBmisc::extractSubList(base_lrn, "id")),
  makeDiscreteParam("h20_2.hidden", values = list(a = c(16L, 16L),
                                                  b = c(32L, 32L)),
                    requires = quote(selected.learner == "h20_2")),
  makeDiscreteParam("h20_3.hidden", values = list(a = c(16L, 16L, 16L),
                                                  b = c(32L, 32L, 32L)),
                    requires = quote(selected.learner == "h20_3")),
  makeDiscreteParam("h20_2.activation", values = "RectifierWithDropout", tunable = FALSE,
                    requires = quote(selected.learner == "h20_2")),
  makeDiscreteParam("h20_3.activation", values = "RectifierWithDropout", tunable = FALSE,
                    requires = quote(selected.learner == "h20_3")),
  makeNumericParam("h20_2.input_dropout_ratio", lower = 0, upper = 0.4, default = 0.1,
                   requires = quote(selected.learner == "h20_2")),
  makeNumericParam("h20_3.input_dropout_ratio", lower = 0, upper = 0.4, default = 0.1,
                   requires = quote(selected.learner == "h20_3")),
  makeNumericVectorParam("h20_2.hidden_dropout_ratios", len = 2, lower = 0, upper = 0.6, default = rep(0.3, 2),
                         requires = quote(selected.learner == "h20_2")),
  makeNumericVectorParam("h20_3.hidden_dropout_ratios", len = 3, lower = 0, upper = 0.6, default = rep(0.3, 3),
                         requires = quote(selected.learner == "h20_3")))

tw <- makeTuneWrapper(mm_lrn,
                      resampling = cv3,
                      control = ctrl,
                      par.set = par_set,
                      show.info = TRUE,
                      measures = list(auc,
                                      bac))

perf_tw <- resample(tw, 
                    task = sonar.task,
                    resampling = cv5,
                    extract = getTuneResult,
                    models = TRUE,
                    show.info = TRUE,
                    measures = list(auc,
                                    bac))

Is there a more elegant solution?


Solution

  • I've no experience with h2o learners or their deep learning approach.

    However, specifying the same parameter twice in a single ParamSet (as in your first attempt) won't work, because parameter names within a ParamSet must be unique. So you will always need two ParamSets anyway.

    I cannot say anything about the second error you are getting. It looks like an h2o-related problem.

    Using makeModelMultiplexer() is one option. You can also use single benchmark() calls and aggregate them afterwards.
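
If you stay with makeModelMultiplexer(), the repetition in the ParamSet from your edit can probably be reduced by generating the per-learner parameters in a small function. The following is only a sketch: `make_arch_params` and `archs` are hypothetical names introduced here, the learner ids and value ranges are copied from the question, and I have not run it against an h2o cluster.

```r
library(mlr)  # attaches ParamHelpers, which provides the make*Param() functions

# Hypothetical helper (not part of mlr): builds the parameters for one base
# learner of the multiplexer, given its id and its number of hidden layers.
make_arch_params <- function(id, n_layers) {
  # Activate these parameters only when this learner is selected
  req <- bquote(selected.learner == .(id))
  list(
    makeDiscreteParam(paste0(id, ".hidden"),
                      values = list(a = rep(16L, n_layers),
                                    b = rep(32L, n_layers)),
                      requires = req),
    makeDiscreteParam(paste0(id, ".activation"),
                      values = "RectifierWithDropout", tunable = FALSE,
                      requires = req),
    makeNumericParam(paste0(id, ".input_dropout_ratio"),
                     lower = 0, upper = 0.4, default = 0.1,
                     requires = req),
    # One dropout ratio per hidden layer
    makeNumericVectorParam(paste0(id, ".hidden_dropout_ratios"),
                           len = n_layers, lower = 0, upper = 0.6,
                           default = rep(0.3, n_layers),
                           requires = req)
  )
}

# Learner ids from the question, mapped to their number of hidden layers
archs <- c(h20_2 = 2L, h20_3 = 3L)

# Collect all Param objects in one flat list and pass it via `params`
par_set <- makeParamSet(params = c(
  list(makeDiscreteParam("selected.learner", values = names(archs))),
  do.call(c, unname(Map(make_arch_params, names(archs), archs)))
))
```

The resulting `par_set` should be usable in place of the hand-written one in `makeTuneWrapper(mm_lrn, ...)`, and adding another architecture only means adding an entry to `archs` and a matching base learner. mlr also provides `makeModelMultiplexerParamSet()`, which is meant to add the learner prefixes and `requires` conditions for you; its help page is worth checking before rolling your own helper.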