Tags: azure-machine-learning-service, azure-ai

Best way to get classification probability from Endpoint deployed from Azure AI Automated ML


I used Azure AI Automated ML to create a classification model, and then deployed an Endpoint from that model. The endpoint gets me back the predicted value, but not the probability. Is there an easy way to add the probability to the output of the endpoint?

I've seen a couple of similar questions, either without answers, or with answers that don't seem to directly apply in this scenario.


Solution

  • You need to set the extra global parameter method to predict_proba in the GlobalParameters field of your request body when invoking the endpoint.

    Example:

    {
        "GlobalParameters": {
            "method": "predict_proba"
        },
        "Inputs": {
            "data": [
                {
                    "age": 35,
                    "job": "services",
                    "marital": "divorced",
                    "education": "university.degree",
                    "default": "no"
                }
            ]
        }
    }
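
    If you call the scoring URI directly over REST instead of through the SDK, the same body can be built programmatically. A minimal sketch, where build_request_body is a hypothetical helper and the endpoint URL and key would be your own:

```python
import json

def build_request_body(records, method="predict_proba"):
    # Wrap the input rows and the GlobalParameters method field
    # into the JSON body the endpoint expects.
    body = {
        "GlobalParameters": {"method": method},
        "Inputs": {"data": records},
    }
    return json.dumps(body)

body = build_request_body([
    {"age": 35, "job": "services", "marital": "divorced",
     "education": "university.degree", "default": "no"},
])
# POST this body to the scoring URI with an
# "Authorization: Bearer <endpoint-key>" header.
print(body)
```

    Building the body with json.dumps avoids hand-written JSON mistakes such as trailing commas.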
    

    If your endpoint accepts only input_data as a parameter, then you need to create your endpoint using a custom scoring script.

    First, register your best model and download its artifacts. You can refer to this notebook, which downloads the artifacts of the best run and registers the model:

    azureml-examples/sdk/python/jobs/automl-standalone-jobs/automl-classification-task-bankmarketing/automl-classification-task-bankmarketing.ipynb at main · Azure/azureml-examples · GitHub

    After downloading the artifacts, you will see the scoring scripts, as shown in the image below.

    import os
    
    # download_artifacts comes from MLflow (assuming the run is accessed via MLflow)
    from mlflow.artifacts import download_artifacts
    
    # Create a local folder for the run artifacts
    local_dir = "./artifact_downloads"
    os.makedirs(local_dir, exist_ok=True)
    
    # best_run is the MLflow run object of the best AutoML trial
    local_path = download_artifacts(
        run_id=best_run.info.run_id, artifact_path="outputs", dst_path=local_dir
    )
    print("Artifacts downloaded in: {}".format(local_path))
    print("Artifacts: {}".format(os.listdir(local_path)))
    

    (screenshot: downloaded artifacts, including the scoring scripts)
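
    The exact file names in the outputs folder vary by SDK version, but the generated scoring scripts typically have names starting with scoring_file. A small sketch to pick them out of the artifact listing; the helper and the example file names are assumptions:

```python
def find_scoring_files(artifact_names):
    # Keep only artifacts that look like AutoML-generated scoring scripts
    return [name for name in artifact_names if name.startswith("scoring_file")]

# Example listing; your outputs folder may differ
artifacts = ["model.pkl", "conda_env_v_1_0_0.yml",
             "scoring_file_v_1_0_0.py", "scoring_file_v_2_0_0.py"]
print(find_scoring_files(artifacts))  # ['scoring_file_v_1_0_0.py', 'scoring_file_v_2_0_0.py']
```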

    Download one of the scoring files to your local system; it contains the script that accepts the global parameters. You don't have to use that exact script — you can adapt it to your needs. Below is a sample code chunk that checks whether the request asks for predict or predict_proba.

    # Imports as in the generated AutoML scoring file (trimmed); logger,
    # sample_global_params, input_sample and output_sample are defined
    # earlier in that file.
    import os
    import joblib
    import pandas as pd
    from azureml.automl.core.shared import logging_utilities, log_server
    from inference_schema.schema_decorators import input_schema, output_schema
    
    def init():
        global model
        # AZUREML_MODEL_DIR points at the registered model; deserialize
        # the model file back into a sklearn model
        model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')
        path = os.path.normpath(model_path)
        path_split = path.split(os.sep)
        log_server.update_custom_dimensions({'model_name': path_split[-3], 'model_version': path_split[-2]})
        try:
            logger.info("Loading model from path.")
            model = joblib.load(model_path)
            logger.info("Loading successful.")
        except Exception as e:
            logging_utilities.log_traceback(e, logger)
            raise
    
    @input_schema('GlobalParameters', sample_global_params, convert_to_provided_type=False)
    @input_schema('Inputs', input_sample)
    @output_schema(output_sample)
    def run(Inputs, GlobalParameters={"method": "predict"}):
        data = Inputs['data']
        if GlobalParameters.get("method", None) == "predict_proba":
            result = model.predict_proba(data)
        elif GlobalParameters.get("method", None) == "predict":
            result = model.predict(data)
        else:
            raise Exception(f"Invalid predict method argument received. GlobalParameters: {GlobalParameters}")
        if isinstance(result, pd.DataFrame):
            result = result.values
        return {'Results': result.tolist()}
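
    The predict/predict_proba dispatch above can be sanity-checked locally, without the azureml decorators or a deployed endpoint. A minimal sketch with a toy scikit-learn model standing in for the AutoML model; the model, data, and run_local helper are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the AutoML model
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X, y)

def run_local(data, method="predict"):
    # Mimic the run() dispatch between predict and predict_proba
    if method == "predict_proba":
        result = model.predict_proba(data)
    elif method == "predict":
        result = model.predict(data)
    else:
        raise ValueError(f"Invalid predict method: {method}")
    return {"Results": result.tolist()}

labels = run_local([[1.5]])                   # class labels
probas = run_local([[1.5]], "predict_proba")  # per-class probabilities
```

    With predict_proba, each entry in Results is a probability vector with one value per class, ordered as in model.classes_ and summing to 1.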
    

    Next, go to the Models tab where you registered the model and deploy it as a real-time endpoint.

    (screenshot: deploying the registered model to a real-time endpoint)

    You will then see a screen like the one below. Click More options to upload the downloaded scoring script.

    (screenshot: deployment options screen)

    Then enter the endpoint name, set the compute type to Managed, and set the authentication type to key-based authentication.

    (screenshot: endpoint name, compute type, and authentication settings)

    In Code + environment, turn on Customize environment and scoring script options. Upload the downloaded script and select the required environment.

    (screenshot: Code + environment settings with the custom scoring script)

    After configuring all the settings, deploy it.

    Now you can test the deployment with the predict_proba parameter and get results.

    Code:

    import pandas as pd
    
    test_data = pd.read_csv("./data/test-mltable-folder/bank_marketing_test_data.csv")
    test_data = test_data.drop("y", axis=1)  # drop the label column
    test_data_json = test_data.to_json(orient="records", indent=4)
    
    data = (
        '{"GlobalParameters": {"method": "predict_proba"}, "Inputs": {"data": '
        + test_data_json
        + "}}"
    )
    
    request_file_name = "sample-request-bankmarketing.json"
    
    with open(request_file_name, "w") as request_file:
        request_file.write(data)
    
    # ml_client is an authenticated azure.ai.ml.MLClient instance
    ml_client.online_endpoints.invoke(
        endpoint_name="jgsml-fdnrp",  # endpoint name
        deployment_name="bankmarketing-model-1",  # deployment name
        request_file=request_file_name,
    )
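
    invoke returns the response as a JSON string. Assuming the response has the shape {"Results": [...]} with one probability vector per input row, the probabilities can be unpacked as below; the numbers are made up for illustration:

```python
import json

# Illustrative response from a predict_proba call; the real string
# comes back from ml_client.online_endpoints.invoke(...)
response = '{"Results": [[0.92, 0.08], [0.35, 0.65]]}'

results = json.loads(response)["Results"]
# Each row holds one probability per class, in model.classes_ order;
# for this binary "y" target, index 1 is assumed to be the positive class.
positive_probs = [row[1] for row in results]
print(positive_probs)  # [0.08, 0.65]
```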
    

    You can also refer to the documentation below on how to deploy AutoML models:

    Deploy an AutoML model with an online endpoint - Azure Machine Learning | Microsoft Learn

    Output:

    Default (predict):

    (screenshot: default endpoint output with predicted labels)

    With predict_proba:

    (screenshot: endpoint output with class probabilities)