Tags: azure-machine-learning-service, azure-ai

Best way to get classification probability from Endpoint deployed from Azure AI Automated ML


I used Azure AI Automated ML to create a classification model, and then deployed an Endpoint from that model. The endpoint gets me back the predicted value, but not the probability. Is there an easy way to add the probability to the output of the endpoint?

I've seen a couple of similar questions, either without answers, or with answers that don't seem to directly apply in this scenario.


Solution

  • You need to set the extra global parameter method to predict_proba in the GlobalParameters field of your request body when invoking the endpoint.

    Example:

    {
        "GlobalParameters": {
            "method": "predict_proba"
        },
        "Inputs": {
            "data": [
                {
                    "age": 35,
                    "job": "services",
                    "marital": "divorced",
                    "education": "university.degree",
                    "default": "no"
                }
            ]
        }
    }
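
    If you call the scoring URI directly over REST instead of through the SDK, the same body can be built programmatically. A minimal sketch, where build_request_body is a hypothetical helper and the endpoint URL and key would be your own:

```python
import json

def build_request_body(records, method="predict_proba"):
    # Wrap the input rows and the GlobalParameters method field
    # into the JSON body the endpoint expects.
    body = {
        "GlobalParameters": {"method": method},
        "Inputs": {"data": records},
    }
    return json.dumps(body)

body = build_request_body([
    {"age": 35, "job": "services", "marital": "divorced",
     "education": "university.degree", "default": "no"},
])
# POST this body to the scoring URI with an
# "Authorization: Bearer <endpoint-key>" header.
print(body)
```

    Building the body with json.dumps avoids hand-written JSON mistakes such as trailing commas.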
    

    If your endpoint accepts only input_data as a parameter, then you need to create your endpoint using a custom scoring script.

    First, register your best model and download its artifacts. You can refer to this notebook, which downloads the artifacts of the best run and registers the model:

    azureml-examples/sdk/python/jobs/automl-standalone-jobs/automl-classification-task-bankmarketing/automl-classification-task-bankmarketing.ipynb at main · Azure/azureml-examples · GitHub

    After downloading the artifacts, you will see the scoring scripts, as shown in the image below.

    import os
    
    # download_artifacts comes from MLflow (assuming the run is accessed via MLflow)
    from mlflow.artifacts import download_artifacts
    
    # Create a local folder for the run artifacts
    local_dir = "./artifact_downloads"
    os.makedirs(local_dir, exist_ok=True)
    
    # best_run is the MLflow run object of the best AutoML trial
    local_path = download_artifacts(
        run_id=best_run.info.run_id, artifact_path="outputs", dst_path=local_dir
    )
    print("Artifacts downloaded in: {}".format(local_path))
    print("Artifacts: {}".format(os.listdir(local_path)))
    

    (screenshot: downloaded artifacts, including the scoring scripts)
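
    The exact file names in the outputs folder vary by SDK version, but the generated scoring scripts typically have names starting with scoring_file. A small sketch to pick them out of the artifact listing; the helper and the example file names are assumptions:

```python
def find_scoring_files(artifact_names):
    # Keep only artifacts that look like AutoML-generated scoring scripts
    return [name for name in artifact_names if name.startswith("scoring_file")]

# Example listing; your outputs folder may differ
artifacts = ["model.pkl", "conda_env_v_1_0_0.yml",
             "scoring_file_v_1_0_0.py", "scoring_file_v_2_0_0.py"]
print(find_scoring_files(artifacts))  # ['scoring_file_v_1_0_0.py', 'scoring_file_v_2_0_0.py']
```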

    Download one of the scoring files to your local system; it contains the script that accepts the global parameters. You don't have to use that exact script — you can adapt it to your needs. Below is a sample code chunk that checks whether the request asks for predict or predict_proba.

    # Imports as in the generated AutoML scoring file (trimmed); logger,
    # sample_global_params, input_sample and output_sample are defined
    # earlier in that file.
    import os
    import joblib
    import pandas as pd
    from azureml.automl.core.shared import logging_utilities, log_server
    from inference_schema.schema_decorators import input_schema, output_schema
    
    def init():
        global model
        # AZUREML_MODEL_DIR points at the registered model; deserialize
        # the model file back into a sklearn model
        model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')
        path = os.path.normpath(model_path)
        path_split = path.split(os.sep)
        log_server.update_custom_dimensions({'model_name': path_split[-3], 'model_version': path_split[-2]})
        try:
            logger.info("Loading model from path.")
            model = joblib.load(model_path)
            logger.info("Loading successful.")
        except Exception as e:
            logging_utilities.log_traceback(e, logger)
            raise
    
    @input_schema('GlobalParameters', sample_global_params, convert_to_provided_type=False)
    @input_schema('Inputs', input_sample)
    @output_schema(output_sample)
    def run(Inputs, GlobalParameters={"method": "predict"}):
        data = Inputs['data']
        if GlobalParameters.get("method", None) == "predict_proba":
            result = model.predict_proba(data)
        elif GlobalParameters.get("method", None) == "predict":
            result = model.predict(data)
        else:
            raise Exception(f"Invalid predict method argument received. GlobalParameters: {GlobalParameters}")
        if isinstance(result, pd.DataFrame):
            result = result.values
        return {'Results': result.tolist()}
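
    The predict/predict_proba dispatch above can be sanity-checked locally, without the azureml decorators or a deployed endpoint. A minimal sketch with a toy scikit-learn model standing in for the AutoML model; the model, data, and run_local helper are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the AutoML model
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(X, y)

def run_local(data, method="predict"):
    # Mimic the run() dispatch between predict and predict_proba
    if method == "predict_proba":
        result = model.predict_proba(data)
    elif method == "predict":
        result = model.predict(data)
    else:
        raise ValueError(f"Invalid predict method: {method}")
    return {"Results": result.tolist()}

labels = run_local([[1.5]])                   # class labels
probas = run_local([[1.5]], "predict_proba")  # per-class probabilities
```

    With predict_proba, each entry in Results is a probability vector with one value per class, ordered as in model.classes_ and summing to 1.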
    

    Next, go to the Models tab where you registered the model and deploy it as a real-time endpoint.

    (screenshot: deploying the registered model to a real-time endpoint)

    You will then see a screen like the one below. Click More options to upload the downloaded scoring script.

    (screenshot: deployment options screen)

    Then enter the endpoint name, set the compute type to Managed, and set the authentication type to key-based authentication.

    (screenshot: endpoint name, compute type, and authentication settings)

    In Code + environment, turn on Customize environment and scoring script options. Upload the downloaded script and select the required environment.

    (screenshot: Code + environment settings with the custom scoring script)

    After configuring all the settings, deploy it.

    Now you can test the deployment with the predict_proba parameter and get results.

    Code:

    import pandas as pd
    
    test_data = pd.read_csv("./data/test-mltable-folder/bank_marketing_test_data.csv")
    test_data = test_data.drop("y", axis=1)  # drop the label column
    test_data_json = test_data.to_json(orient="records", indent=4)
    
    data = (
        '{"GlobalParameters": {"method": "predict_proba"}, "Inputs": {"data": '
        + test_data_json
        + "}}"
    )
    
    request_file_name = "sample-request-bankmarketing.json"
    
    with open(request_file_name, "w") as request_file:
        request_file.write(data)
    
    # ml_client is an authenticated azure.ai.ml.MLClient instance
    ml_client.online_endpoints.invoke(
        endpoint_name="jgsml-fdnrp",  # endpoint name
        deployment_name="bankmarketing-model-1",  # deployment name
        request_file=request_file_name,
    )
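
    invoke returns the response as a JSON string. Assuming the response has the shape {"Results": [...]} with one probability vector per input row, the probabilities can be unpacked as below; the numbers are made up for illustration:

```python
import json

# Illustrative response from a predict_proba call; the real string
# comes back from ml_client.online_endpoints.invoke(...)
response = '{"Results": [[0.92, 0.08], [0.35, 0.65]]}'

results = json.loads(response)["Results"]
# Each row holds one probability per class, in model.classes_ order;
# for this binary "y" target, index 1 is assumed to be the positive class.
positive_probs = [row[1] for row in results]
print(positive_probs)  # [0.08, 0.65]
```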
    

    You can also refer to the documentation below on how to deploy AutoML models:

    Deploy an AutoML model with an online endpoint - Azure Machine Learning | Microsoft Learn

    Output:

    Default (predict):

    (screenshot: default endpoint output with predicted labels)

    With predict_proba:

    (screenshot: endpoint output with class probabilities)