Tags: r, machine-learning, h2o

Is it possible to apply grid search to unsupervised isolation forest in H2O?


I am trying to apply grid search to H2O unsupervised isolation forest in R. Here is my code:

    Accesses.hex <- as.h2o(Accesses)
    x <- names(Accesses.hex)
    seed <- 12345

    # Model hyperparameters
    hyper_params <- list(ntrees = c(50, 100, 150, 200),
                         max_depth = c(8, 15, 20, 30), # default is 8
                         sample_size = c(128, 256, 512))

    # Early stopping criteria
    search_criteria <- list(strategy = "RandomDiscrete",
                            max_models = 100,
                            max_runtime_secs = 4000,
                            stopping_rounds = 15,
                            seed = seed)

    model.grid <- h2o.grid(algorithm = "isolationForest",
                           x = x,
                           grid_id = "model_grid",
                           training_frame = Accesses.hex,
                           hyper_params = hyper_params,
                           search_criteria = search_criteria,
                           seed = seed)

However, I got an error saying:

    Error in h2o.grid(algorithm = "isolationForest", x = x, grid_id = "model_grid", :
      Must specify response, y

I am using isolation forest for unsupervised learning here, so I don’t have the response variable y. Is it possible to do a grid search within H2O in this case?

My computer: OS X 10.14.6, 16 GB memory

    H2O cluster version:        3.30.0.1
    H2O cluster total nodes:    1
    H2O cluster total memory:   15.00 GB
    H2O cluster total cores:    16
    H2O cluster allowed cores:  16
    H2O cluster healthy:        TRUE
    R Version:                  R version 3.6.3 (2020-02-29)

Please let me know if there is any other information I can provide. Thanks for your help!


Solution

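  • No, not in H2O 3.30.0.1. In the current design, h2o.grid requires a response (target) column y, and an unsupervised isolation forest has none, which is why the call fails with "Must specify response, y". Grid search support for isolation forest is in development and is targeted for release in 3.30.1.1, according to this Jira.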
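
Until h2o.grid accepts isolation forest, one possible workaround is to loop over the hyperparameter combinations yourself and fit each model with h2o.isolationForest directly. The sketch below is only an illustration under a few assumptions: fit_isolation_forests and param_grid are hypothetical names introduced here (they are not part of h2o), the hyperparameter values are copied from the question, and an H2O cluster is assumed to be running already. It fits every combination instead of sampling randomly, and it does not rank the models, because without a response column there is no built-in metric to sort by.

```r
# One row per hyperparameter combination (same values as in the question).
param_grid <- expand.grid(ntrees = c(50, 100, 150, 200),
                          max_depth = c(8, 15, 20, 30),
                          sample_size = c(128, 256, 512))

# Hypothetical helper (not an h2o function): fits one isolation forest per
# row of param_grid and returns the fitted models as a list.
# `frame` is assumed to be an H2OFrame living on a running H2O cluster.
fit_isolation_forests <- function(frame, param_grid, seed = 12345) {
  lapply(seq_len(nrow(param_grid)), function(i) {
    h2o::h2o.isolationForest(training_frame = frame,
                             x = names(frame),
                             model_id = paste0("iforest_", i),
                             ntrees = param_grid$ntrees[i],
                             max_depth = param_grid$max_depth[i],
                             sample_size = param_grid$sample_size[i],
                             seed = seed)
  })
}

# Usage, assuming library(h2o) and h2o.init() have been run and Accesses.hex
# exists as in the question:
# models <- fit_isolation_forests(Accesses.hex, param_grid)
# scores <- h2o::h2o.predict(models[[1]], Accesses.hex)  # anomaly scores for one model
```

How to choose among the fitted models is left open: with no labels, you would need your own criterion, for example agreement with a small hand-labelled sample of known anomalies.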