Tags: azure, azure-application-insights, azure-aks, azure-machine-learning-service

Azure Machine Learning application logs to Application Insights with custom dimensions


How can you log custom dimensions to Application Insights from your run or init functions in score.py for Azure Machine Learning inference?

With the basic logger, all logs emitted inside the init and run functions are lost. I tried to address this by using the AzureLogHandler. Although that enabled logging to Application Insights, I found that I have to instantiate the logger inside both init and run for it to work correctly.

Additionally, how can I link my requestId to these logs in Application Insights to track which inference request this log is generated for?

P.S. I'm using the Python Azure Machine Learning SDK v2 and a Kubernetes (AKS)-backed online endpoint.


Solution

  • Azure AutoML's generated scoring scripts already include logging. Below is a sample scoring script:

    This logs data both to the endpoint logs and to the Application Insights resource associated with your Azure ML workspace.

    Here is the sample code:

    import json
    import logging
    import os
    import pickle
    import numpy as np
    import pandas as pd
    import joblib
    
    import azureml.automl.core
    from azureml.automl.core.shared import logging_utilities, log_server
    from azureml.telemetry import INSTRUMENTATION_KEY
    
    from inference_schema.schema_decorators import input_schema, output_schema
    from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
    from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType
    from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType
    
    data_sample = StandardPythonParameterType([1,2])
    input_sample = StandardPythonParameterType({'data': data_sample})
    method_sample = StandardPythonParameterType("predict")
    sample_global_params = StandardPythonParameterType({"method": method_sample})
    
    result_sample = NumpyParameterType(np.array(["example_value"]))
    output_sample = StandardPythonParameterType({'Results':result_sample})
    
    try:
        # Send scoring-script telemetry to the Application Insights resource linked to the workspace
        log_server.enable_telemetry(INSTRUMENTATION_KEY)
        log_server.set_verbosity('INFO')
        logger = logging.getLogger('azureml.automl.core.scoring_script_v2')
    except Exception:
        pass
    
    def init():
        global model
        model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')
        path = os.path.normpath(model_path)
        path_split = path.split(os.sep)
        # Record the model name and version as custom dimensions on every log record
        log_server.update_custom_dimensions({'model_name': path_split[-3], 'model_version': path_split[-2]})
        try:
            logger.info("Loading model from path.")
            model = joblib.load(model_path)
            logger.info("Loading successful.")
        except Exception as e:
            logging_utilities.log_traceback(e, logger)
            raise
    
    @input_schema('Inputs', input_sample)
    @output_schema(output_sample)
    def run(Inputs):
        data = Inputs['data']
        # Demo result; a real scoring script would call model.predict(data) here
        result = ["Method proba executed", sum(data)]
        logger.info("Request executed method called")
        if isinstance(result, pd.DataFrame):
            result = result.values
        if isinstance(result, np.ndarray):
            result = result.tolist()
        return {'Results': result}
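
    To also link a request id to these log entries (the second part of the question), one option is to accept the raw HTTP request in run and push the id into the custom dimensions. The sketch below is illustrative, not the documented AutoML template: the @rawhttp/AMLRequest imports are assumed to be available in your inference image, the x-request-id header name is an assumption (inspect request.headers to see what your server actually forwards), and @rawhttp replaces the input_schema/output_schema decorators on run.

    import json
    import logging

    # Assumed imports for raw-request handling; see the Azure ML "advanced entry script authoring" docs
    from azureml.contrib.services.aml_request import AMLRequest, rawhttp
    from azureml.contrib.services.aml_response import AMLResponse
    from azureml.automl.core.shared import log_server

    logger = logging.getLogger('azureml.automl.core.scoring_script_v2')

    @rawhttp
    def run(request: AMLRequest):
        # 'x-request-id' is an assumption - check request.headers for the header your server forwards
        request_id = request.headers.get('x-request-id', 'unknown')

        # Attach the request id as a custom dimension so subsequent log records carry it
        log_server.update_custom_dimensions({'request_id': request_id})

        data = request.get_json().get('data', [])
        result = ["Method proba executed", sum(data)]
        logger.info("Request executed method called")
        return AMLResponse(json.dumps({'Results': result}), 200)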
    

    Additionally, you can enable inferencing data collection and Application Insights diagnostics when you create the endpoint/deployment, so that more of the endpoint logs flow into your Application Insights resource, as sketched below.
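
    With SDK v2, Application Insights diagnostics is a flag on the deployment. A minimal sketch, assuming an AKS-attached online endpoint already exists and using placeholder names for the subscription, workspace, model, and environment:

    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import CodeConfiguration, KubernetesOnlineDeployment
    from azure.identity import DefaultAzureCredential

    # Placeholder identifiers - replace with your own values
    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id="<subscription-id>",
        resource_group_name="<resource-group>",
        workspace_name="<workspace-name>",
    )

    deployment = KubernetesOnlineDeployment(
        name="blue",
        endpoint_name="<endpoint-name>",
        model="<model-name>:<model-version>",
        environment="<environment-name>:<environment-version>",
        code_configuration=CodeConfiguration(code="./src", scoring_script="score.py"),
        instance_count=1,
        app_insights_enabled=True,  # route scoring logs to the workspace Application Insights
    )

    ml_client.online_deployments.begin_create_or_update(deployment).result()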

    The AzureLogHandler you are currently using (from the OpenCensus Azure exporter) is deprecated; Microsoft supports it only until its retirement on September 30, 2024.

    Microsoft recommends the OpenTelemetry-based Python offering.

    You can refer to the azure-monitor-opentelemetry-exporter GitHub repository for more details; below is a sample.

    import os
    import logging
    
    from opentelemetry._logs import (get_logger_provider, set_logger_provider,)
    from opentelemetry.sdk._logs import (LoggerProvider, LoggingHandler,)
    from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
    from azure.monitor.opentelemetry.exporter import AzureMonitorLogExporter
    
    logger_provider = LoggerProvider()
    set_logger_provider(logger_provider)

    # Use the full connection string of your Application Insights resource
    exporter = AzureMonitorLogExporter.from_connection_string("InstrumentationKey=cccnmmm....")
    # Export log records in batches every 60 seconds
    get_logger_provider().add_log_record_processor(BatchLogRecordProcessor(exporter, schedule_delay_millis=60000))

    # Route records from a standard Python logger through the OpenTelemetry handler
    handler = LoggingHandler()
    logger = logging.getLogger(__name__)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    
    logger.warning("Hello World! Level-Warning")
    logger.info("Hello World! Level-INFO")
    

    Output:

    (Screenshot: the warning and info records from the sample above shown in Application Insights.)

    To log extra custom dimensions, pass them via the extra argument:

    logger.info("INFO: Debug with properties", extra={"debug": "true"})