Tags: python, azure, numpy, azure-stream-analytics, azure-container-instances

Error: "Your deployment does not have an associated swagger.json" - ACI deployment on Stream Analytics Job


Latest update: In the current public preview release of Stream Analytics Job, ACI container deployment is not supported. Thus, I will close this question until further notice. For more information, follow the GitHub thread posted below.

Note: The problem arises when the Deployment value is an ACI container and not an AKS cluster. With Kubernetes (AKS) clusters, the Azure ML Service Function is created successfully. However, I want to test my function with an ACI container, not an AKS cluster.

I am trying to create an Azure ML Service Function in the Stream Analytics Job service. For this, I am using an ML model already deployed in an Azure Container Instance (a.k.a. ACI). However, I get this error:

(screenshot of the error: "Your deployment does not have an associated swagger.json")

Links: the issue on GitHub and the related Microsoft documentation.

This error occurs despite the following three factors:

Factor 1: When I use the scoring URI (of the ACI container) to score some values locally (in a Jupyter Notebook), the scoring succeeds (see the sketch after this list).
Factor 2: I already infer the schema of the input data inside my score.py file.
Factor 3: I include the inference-schema[numpy-support] package as a dependency in the environment file.
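
For reference, this is a minimal sketch of the local test described in Factor 1 (the scoring URI, key, and payload are placeholders, not my actual values):

import json
import requests

scoring_uri = "<scoring-uri-of-the-aci-deployment>"   # ACI scoring URI
primary_key = "<primary-key>"                          # authorization (primary) key

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + primary_key,
}
# the payload shape matches the input_sample declared in score.py below
payload = json.dumps({"raw_data": [["0", 0, 0, 0, 0, 0]]})

response = requests.post(scoring_uri, data=payload, headers=headers)
print(response.json())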

What am I doing wrong?

The ACI container instance is deployed with an authorization (primary) key, and I infer the schema of the input and output samples in my score.py file. However, the Stream Analytics job cannot recognize the swagger file. Since I infer the schema in score.py, I read that the swagger.json file should have been generated automatically.
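
As a sanity check, the generated swagger can be confirmed from the SDK; this is a small sketch (the service name is a placeholder):

from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()                         # assumes a local config.json
aci_service = Webservice(ws, name="my-aci-service")  # placeholder service name

print(aci_service.scoring_uri)    # REST endpoint used for scoring
print(aci_service.swagger_uri)    # non-empty if a swagger.json was generated
print(aci_service.get_keys()[0])  # authorization (primary) key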

A sample of my score.py file:

import json
import numpy as np
import os
import itertools
import joblib
from sklearn.ensemble import RandomForestRegressor

from azureml.core.model import Model

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType


def init():
    global model

    # retrieve the path to the model file using the model name
    model_path = Model.get_model_path('<model_name>')
    model = joblib.load(model_path)


# samples used by inference-schema to auto-generate the swagger schema
input_sample = np.array([["0", 0, 0, 0, 0, 0]])
output_sample = np.array([0])


@input_schema('raw_data', NumpyParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(raw_data):
    try:
        data = np.array(raw_data)
        result = []

        for array in data:
            # the first element selects the model; the rest are the features
            prediction_result = model[array[0]].predict(array[1:].reshape(1, -1))
            result.append(prediction_result.tolist())

        # flatten the list of per-row prediction lists
        result = list(itertools.chain.from_iterable(result))

        # you can return any data type as long as it is JSON-serializable
        return result

    except Exception as e:
        error = str(e)
        return error

Sample of my env.yml file:

name: project_environment
dependencies:
  - python=3.7.3
  - pip:
    - azureml-defaults
    - inference-schema[numpy-support]
    - joblib
    - numpy
    - scikit-learn==0.20.3
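
For completeness, this is a sketch of how the deployment to ACI could look with this environment and score.py (not my exact script; the workspace, model, and service names are placeholders):

from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

env = Environment.from_conda_specification(name="project_environment",
                                           file_path="env.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# key-based auth so the service exposes a primary/secondary key
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                       memory_gb=1,
                                                       auth_enabled=True)

service = Model.deploy(workspace=ws,
                       name="my-aci-service",
                       models=[Model(ws, name="<model_name>")],
                       inference_config=inference_config,
                       deployment_config=deployment_config)
service.wait_for_deployment(show_output=True)
print(service.swagger_uri)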

I would appreciate any comments that could help me solve this issue.

Key Finding:

I compared the swagger.json file of an AKS cluster with that of an ACI container instance, and the difference between the two files is the key "paths".

In AKS, the path in swagger.json is: "paths": { "/api/v1/service/aks-service/": ... }
In ACI, the path in swagger.json is: "paths": { "/": ... }

Part of swagger.json of the AKS cluster:

(screenshot of the AKS swagger.json "paths" section)

Part of swagger.json of the ACI container instance:

(screenshot of the ACI swagger.json "paths" section)

I assume this may be the root of the problem: maybe Stream Analytics Job Functions cannot recognize the path "/" and therefore cannot auto-generate the function signature for an ACI container.
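
A sketch of how the two swagger documents can be compared programmatically (service names are placeholders; depending on the auth settings, an Authorization header may also be required):

import requests
from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()

for name in ("my-aks-service", "my-aci-service"):
    service = Webservice(ws, name=name)
    swagger = requests.get(service.swagger_uri).json()
    print(name, list(swagger.get("paths", {}).keys()))

# Per the key finding above, AKS lists "/api/v1/service/aks-service/",
# while ACI lists only "/".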


Solution

  • We are first starting with support for AKS, as that is the recommended approach for real-time scoring. As this feature is in public preview, we are finalizing some performance benchmarks for models deployed on ACI so that it can be reliably used for dev/test purposes. We should have support for ACI deployment available within the next few weeks.