Tags: python-3.x, classification, h2o, gbm

Predicted class and the corresponding probability are contradictory in H2O


I trained a binary classifier with H2O. I split my data into three sets: train, calibration, and test. After training and calibration, I checked the results on the test set. Here is the relevant part:

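from h2o.estimators import H2OGradientBoostingEstimator
from h2o.grid.grid_search import H2OGridSearch

# hyper_params_gbm, search_criteria and the calibration frame `valid` are defined elsewhere
final_grid = H2OGridSearch(
    model=H2OGradientBoostingEstimator(
        model_id='contract_gbm2',
        stopping_rounds=5, stopping_tolerance=1e-4, stopping_metric="AUC",
        seed=23, balance_classes=True, max_runtime_secs=300,
        calibrate_model=True, calibration_frame=valid,  # calibrate probabilities on `valid`
        nfolds=5),
    hyper_params=hyper_params_gbm,
    search_criteria=search_criteria)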

What I have noticed is that the predicted class and the given probabilities are not always consistent. See below:

[Image: screenshot of the prediction output]

As shown, the predicted class is not always the one with the highest probability. What am I missing?


Solution

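  • For binary classification, the predicted class comes from comparing p1 with the threshold that maximizes F1, not with 0.5. A row can therefore be labeled as the positive class even when p1 is smaller than p0.

    If you don't like that threshold, you can of course compare p1 with whatever threshold you prefer.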
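To make the effect of the threshold concrete, here is a minimal sketch in plain Python (no H2O required). The labels and probabilities are made up for illustration, the helper names `f1_score`, `classify` and `max_f1_threshold` are hypothetical, and the `>=` comparison is an assumption about how ties at the threshold are handled. It only illustrates the idea of a max-F1 threshold; it is not H2O's internal implementation.

```python
def f1_score(labels, preds):
    """F1 for the positive class (1), computed from raw counts."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def classify(p1_values, threshold):
    """Label a row 1 when p1 reaches the threshold (assumed >=), else 0."""
    return [1 if p >= threshold else 0 for p in p1_values]


def max_f1_threshold(labels, p1_values):
    """Try every observed p1 as a threshold and keep the one with the best F1."""
    candidates = sorted(set(p1_values))
    return max(candidates, key=lambda t: f1_score(labels, classify(p1_values, t)))


# Made-up example data: true labels and predicted probabilities of class 1.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
p1 = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]

threshold = max_f1_threshold(labels, p1)

# Labels as a max-F1 rule would assign them vs. a plain 0.5 cut-off.
labels_max_f1 = classify(p1, threshold)
labels_half = classify(p1, 0.5)

print("max-F1 threshold:", threshold)
print("labels with max-F1 threshold:", labels_max_f1)
print("labels with 0.5 threshold:   ", labels_half)
```

In this toy data, the rows with p1 of 0.30, 0.35 and 0.40 are labeled 1 under the max-F1 threshold even though p0 is larger than p1 for each of them, which is the same pattern as in the question. With a real H2O prediction frame, the equivalent step would be to compare the p1 column with a threshold of your choice; the model's own threshold can, as far as I know, be looked up with find_threshold_by_max_metric("f1").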