How to access elements of a confusion matrix in H2O for Python?


I ran a grid search that produced 36 models.

For each model, the confusion matrix is available with:

grid_search.get_grid(sort_by='a_metrics', decreasing=True)[index].confusion_matrix(valid=valid_set)

My problem is that I only want to access certain parts of this confusion matrix in order to build my own ranking, which is not natively available in h2o.

Say the confusion matrix of the first model in the grid search is the one below:

+---+-------+--------+--------+--------+------------------+
|   |       |   0    |   1    | Error  |       Rate       |
+---+-------+--------+--------+--------+------------------+
| 0 | 0     |  766.0 | 2718.0 | 0.7801 | (2718.0/3484.0)  |
| 1 | 1     |  351.0 | 6412.0 | 0.0519 | (351.0/6763.0)   |
| 2 | Total | 1117.0 | 9130.0 | 0.2995 | (3069.0/10247.0) |
+---+-------+--------+--------+--------+------------------+

The only thing that really interests me is the precision for class 0, which is 766/1117 ≈ 0.6858. h2o, however, computes its precision metrics across all classes, which works against what I am looking for.

I tried to convert it to a DataFrame with:

model = grid_search.get_grid(sort_by='a_metrics', decreasing=True)[0]
model.confusion_matrix(valid=valid_set).as_data_frame()

Even though some threads online suggest this works, it does not (or no longer does):

AttributeError: 'ConfusionMatrix' object has no attribute 'as_data_frame'

I searched for a way to list the attributes of the confusion matrix, without success.


Solution

  • According to the H2O documentation, the ConfusionMatrix class has no as_data_frame method: http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/_modules/h2o/model/confusion_matrix.html

    I assume the easiest way is to call to_list() on the confusion matrix and compute the precision from the counts yourself.
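
As a rough sketch of that idea, the snippet below computes the precision of one class from a nested list of counts. It assumes to_list() returns the 2x2 counts in the same layout as the printed table (rows are actual classes, columns are predicted classes). This layout is not confirmed here, so check it against your h2o version. The helper class_precision is a hypothetical name, not part of h2o, and the counts are copied by hand from the table in the question so the example runs without a cluster.

```python
def class_precision(cm, class_index):
    """Precision of one class from a nested-list confusion matrix.

    Assumed layout: cm[i][j] is the number of rows whose actual class
    is i and whose predicted class is j.
    """
    # Everything the model predicted as `class_index` (one column of the matrix)
    predicted_total = sum(row[class_index] for row in cm)
    if predicted_total == 0:
        return float("nan")  # the class was never predicted
    # Correct predictions for that class sit on the diagonal
    return cm[class_index][class_index] / predicted_total


# Counts copied from the table in the question (hypothetical stand-in
# for what to_list() would return on a live model)
cm = [[766, 2718], [351, 6412]]

precision_0 = class_precision(cm, 0)
print(round(precision_0, 4))  # 0.6858, i.e. 766 / 1117
```

With a live grid, cm would presumably come from calling to_list() on each model's confusion matrix, after which the models can be sorted by the resulting value to get the custom ranking.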