pyspark logistic-regression h2o sparkling-water h2o.ai

Get Stage Results from H2O Sparkling Water model


I am looking to create a confidence interval for one of my model's outputs and I need to get the model outputs before the link function is applied. From what I've read, it seems like I am interested in getting the stage results of the model.

So far I have created a model with withStageResults=True, fit it, verified the parameter value, and obtained predictions, but the detailed_predictions column still does not contain the stage results.

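from pysparkling.ml import H2OGLMClassifier

# feature_columns, response and training_data are defined elsewhere
estimator = H2OGLMClassifier(family='binomial', featureCols=feature_columns, labelCol=response, withStageResults=True)
model = estimator.fit(training_data)

predictions = model.transform(training_data)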

The predictions then have a detailed_predictions column, but it contains only JSON with the prediction probabilities, the same as if stage results had not been requested, e.g.

{
  "label": "1",
  "probabilities": {"0": ".814", "1": ".176"}
}

Is there something else that needs to be done to obtain the stage results? Are the stage results not the correct way to get what I am looking for?

Thanks


Solution

  • If you want confidence intervals for your results, what you are looking for are confidence intervals for the coefficients. To get them, build the GLM with compute_p_values and remove_collinear_columns set to true. Once the model is built, you can call model.coef_with_p_values(), which returns the model coefficients along with their p-values, standard errors (std_error), and other fields.
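
Once you have a coefficient and its std_error, one common way to turn them into an interval is the Wald form, estimate ± z * std_error. The snippet below is only a sketch of that arithmetic in plain Python: the helper names wald_interval and logit are hypothetical, the numbers are illustrative rather than taken from any model, and it assumes the binomial family is using the logit link, in which case the value before the inverse link can be recovered from a predicted probability as log(p / (1 - p)).

```python
import math
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Return (lower, upper) Wald bounds for one coefficient.

    Hypothetical helper: estimate and std_error are assumed to come from
    the coefficient table described above.
    """
    # Two-sided normal quantile, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def logit(p):
    """Map a probability back to the linear-predictor scale.

    Assumption: the model uses the logit link.
    """
    return math.log(p / (1.0 - p))


# Illustrative numbers only, not output from a real model.
lower, upper = wald_interval(estimate=0.8, std_error=0.25)
print(f"95% interval for the coefficient: [{lower:.3f}, {upper:.3f}]")
print(f"linear predictor for p=0.5: {logit(0.5):.3f}")
```

Note that this gives an interval for a single coefficient. An interval for a prediction would also need the covariance between coefficients, which this sketch does not cover.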