I used Azure AI Automated ML to create a classification model, and then deployed an Endpoint from that model. The endpoint gets me back the predicted value, but not the probability. Is there an easy way to add the probability to the output of the endpoint?
I've seen a couple of similar questions, either without answers, or with answers that don't seem to directly apply in this scenario.
You need to pass the extra global parameter predict_proba in the method field of your request body when invoking the endpoint.
Example:
{
  "GlobalParameters": {
    "method": "predict_proba"
  },
  "Inputs": {
    "data": [
      {
        "age": 35,
        "job": "services",
        "marital": "divorced",
        "education": "university.degree",
        "default": "no"
      }
    ]
  }
}
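For a quick check, you can post this body straight to the endpoint's REST scoring URI. A minimal sketch using requests, where the scoring URI and key are placeholders you take from the endpoint's Consume tab:
import requests

scoring_uri = "https://<endpoint-name>.<region>.inference.ml.azure.com/score"  # placeholder
api_key = "<endpoint-key>"  # placeholder

body = {
    "GlobalParameters": {"method": "predict_proba"},
    "Inputs": {
        "data": [
            {
                "age": 35,
                "job": "services",
                "marital": "divorced",
                "education": "university.degree",
                "default": "no",
            }
        ]
    },
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
response = requests.post(scoring_uri, json=body, headers=headers)
print(response.json())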
If your endpoint accepts only input_data as a parameter, then you need to create your endpoint using a custom scoring script.
First, register your best model and download its artifacts. You can refer to this notebook for downloading the artifacts of the best run and registering it.
Once the artifacts are downloaded, you will find the scoring scripts among them, as shown in the image below.
import os
# download_artifacts comes from MLflow; best_run is the best AutoML run
# retrieved earlier (e.g., via the MLflow client).
from mlflow.artifacts import download_artifacts

# Create local folder
local_dir = "./artifact_downloads"
if not os.path.exists(local_dir):
    os.mkdir(local_dir)

# Download the outputs folder of the best run, which contains the model
# and the generated scoring scripts.
local_path = download_artifacts(
    run_id=best_run.info.run_id, artifact_path="outputs", dst_path=local_dir
)
print("Artifacts downloaded in: {}".format(local_path))
print("Artifacts: {}".format(os.listdir(local_path)))
Take any one of the scoring files from the downloaded artifacts, since it contains the script that accepts the global parameters. You don't have to use that exact script; you can adapt it to your needs.
Below is a sample code chunk that checks for predict or predict_proba.
import os
import joblib
import pandas as pd

# In the full generated script, these come from the AzureML and
# inference-schema packages; logger, sample_global_params, input_sample,
# and output_sample are defined alongside them.
from azureml.automl.core.shared import logging_utilities, log_server
from inference_schema.schema_decorators import input_schema, output_schema


def init():
    global model
    # Deserialize the model file back into a sklearn model.
    # AZUREML_MODEL_DIR points at the directory of the registered model.
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')
    path = os.path.normpath(model_path)
    path_split = path.split(os.sep)
    log_server.update_custom_dimensions({'model_name': path_split[-3], 'model_version': path_split[-2]})
    try:
        logger.info("Loading model from path.")
        model = joblib.load(model_path)
        logger.info("Loading successful.")
    except Exception as e:
        logging_utilities.log_traceback(e, logger)
        raise


@input_schema('GlobalParameters', sample_global_params, convert_to_provided_type=False)
@input_schema('Inputs', input_sample)
@output_schema(output_sample)
def run(Inputs, GlobalParameters={"method": "predict"}):
    data = Inputs['data']
    # Dispatch on the "method" global parameter: predict_proba returns the
    # class probabilities, predict returns the labels.
    if GlobalParameters.get("method", None) == "predict_proba":
        result = model.predict_proba(data)
    elif GlobalParameters.get("method", None) == "predict":
        result = model.predict(data)
    else:
        raise Exception(f"Invalid predict method argument received. GlobalParameters: {GlobalParameters}")
    if isinstance(result, pd.DataFrame):
        result = result.values
    return {'Results': result.tolist()}
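Before deploying, you can sanity-check the predict/predict_proba distinction locally with any sklearn classifier; a toy sketch (the data here is made up purely for illustration):
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
toy_model = LogisticRegression().fit(X, y)

for method in ("predict", "predict_proba"):
    # Same dispatch the scoring script performs on GlobalParameters["method"]
    result = getattr(toy_model, method)(X[:2])
    print(method, result.tolist())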
Next, go to the Models tab where you registered the model and deploy it as a real-time endpoint.
You will then see a screen like the one below. Click More options to upload the downloaded scoring script.
Enter the endpoint name, set the compute type to Managed, and set the authentication type to key-based authentication.
Under Code + environment, turn on the Customize environment and scoring script options, upload the downloaded script, and select the required environment.
After configuring all the settings, deploy it.
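The same deployment can also be scripted with the v2 SDK instead of the portal. A sketch, assuming the registered model from earlier; the conda file, scoring file name, base image, and instance type are assumptions based on the downloaded artifacts:
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    CodeConfiguration,
    Environment,
)

endpoint = ManagedOnlineEndpoint(name="jgsml-fdnrp", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="bankmarketing-model-1",
    endpoint_name="jgsml-fdnrp",
    model=registered_model,
    # Environment built from the conda file shipped with the artifacts
    # (file and image names assumed).
    environment=Environment(
        conda_file="./artifact_downloads/outputs/conda_env_v_1_0_0.yml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    ),
    code_configuration=CodeConfiguration(
        code="./artifact_downloads/outputs",
        scoring_script="scoring_file_v_2_0_0.py",  # assumed file name
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()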
Now you can test the deployment with the predict_proba parameter and get results.
Code:
import pandas as pd
import json

test_data = pd.read_csv("./data/test-mltable-folder/bank_marketing_test_data.csv")
test_data = test_data.drop("y", axis=1)
test_data_json = test_data.to_json(orient="records", indent=4)

# Build the request body with the predict_proba global parameter.
data = (
    '{"GlobalParameters": {"method": "predict_proba"}, "Inputs": {"data": '
    + test_data_json
    + "}}"
)

request_file_name = "sample-request-bankmarketing.json"
with open(request_file_name, "w") as request_file:
    request_file.write(data)

# ml_client is an authenticated MLClient for the workspace.
ml_client.online_endpoints.invoke(
    endpoint_name="jgsml-fdnrp",  # endpoint name
    deployment_name="bankmarketing-model-1",  # deployment name
    request_file=request_file_name,
)
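invoke returns the raw response as a string; a short sketch of parsing it into probabilities, assuming the {'Results': ...} shape produced by the scoring script above:
response = ml_client.online_endpoints.invoke(
    endpoint_name="jgsml-fdnrp",
    deployment_name="bankmarketing-model-1",
    request_file=request_file_name,
)
# Each row is a list of per-class probabilities.
probabilities = json.loads(response)["Results"]
print(probabilities[:5])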
You can also refer to the documentation below on how to deploy AutoML models:
Deploy an AutoML model with an online endpoint - Azure Machine Learning | Microsoft Learn
Output:
With the default method (predict), the endpoint returns the predicted labels; when giving predict_proba, it returns the class probabilities.