I trained a GBM in h2o using early stopping and setting ntrees=10000. I want to retrieve the number of trees that are actually in the model. But if I call model.params['ntrees'] (where model is the best model from a grid search) I get
{'default': 50, 'actual': 10000}
where 10000 is the value I set for training, not the actual number of trees that ended up in the model.
If I call model.score_history() then I can see that early stopping kicked in at 280 trees. But surely there is a more direct way to find out the actual number of trees in the model than this hack:
best_model.score_history()['number_of_trees'].max()
There currently isn't a clean way to do this. An alternative, which doesn't require calculating a max but is still clunky, is model.summary()['number_of_trees'][0] if you want the number, model.summary()['number_of_trees'] if you want the number in a list, or just model.summary() if you only want to see the number.
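If you need this in more than one place, the two lookups can be wrapped in a small helper. This is only a sketch: actual_ntrees is a made-up name, not part of the h2o API, and it assumes the model object exposes summary() and score_history() with a 'number_of_trees' column as described above.

```python
def actual_ntrees(model):
    """Return the number of trees actually built for a tree-based model.

    Hypothetical helper (not part of h2o): tries the model summary first,
    then falls back to the scoring history.
    """
    try:
        # The summary column is expected to hold one value for a single model.
        return int(list(model.summary()["number_of_trees"])[0])
    except (KeyError, IndexError, TypeError, AttributeError):
        # The scoring history has one row per scoring event, so the
        # largest tree count is the size of the final model.
        return int(max(model.score_history()["number_of_trees"]))
```

For the grid search case in the question, actual_ntrees(best_model) should give the same 280 that the scoring history shows, rather than the 10000 reported by model.params['ntrees'].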