python json string typeerror mlflow

MLflow: ModelSignature.from_dict() throws TypeError: string indices must be integers


I'm trying to create a ModelSignature in MLflow using the ModelSignature.from_dict() function. This is my code:

from mlflow.models import ModelSignature

signature_dict = {
    "inputs": '{"messages": [{"content": "string","role": "string"}]}',
    "outputs": """{
        "question": "string",
        "history": [{"content": "string", "role": "string"}],
        "context": [
            {"metadata": {"url": "string", "id": 0.0}, "page_content": "string"}
        ],
        "answer": "string"
    }""",
    "params": "",
}

model_signature = ModelSignature.from_dict(signature_dict)

When I try to run the code it throws the following error:

TypeError: string indices must be integers

I've done quite a bit of googling and testing but can't get it to work, so I seem to be missing something obvious.

EDIT: Here's the full stacktrace:

TypeError: string indices must be integers
File <command-2446400647160771>, line 16
      1 from mlflow.models import ModelSignature
      3 signature_dict = {
      4     "inputs": '{"messages": [{"content": "string","role": "string"}]}',
      5     "outputs": """{
   (...)
     13     "params": "",
     14 }
---> 16 model_signature = ModelSignature.from_dict(signature_dict)
     17 model_signature
File /databricks/python/lib/python3.10/site-packages/mlflow/models/signature.py:111, in ModelSignature.from_dict(cls, signature_dict)
     98 @classmethod
     99 def from_dict(cls, signature_dict: Dict[str, Any]):
    100     """
    101     Deserialize from dictionary representation.
    102 
   (...)
    109     :return: ModelSignature populated with the data form the dictionary.
    110     """
--> 111     inputs = Schema.from_json(x) if (x := signature_dict.get("inputs")) else None
    112     outputs = Schema.from_json(x) if (x := signature_dict.get("outputs")) else None
    113     params = ParamSchema.from_json(x) if (x := signature_dict.get("params")) else None
File /databricks/python/lib/python3.10/site-packages/mlflow/types/schema.py:465, in Schema.from_json(cls, json_str)
    462 def read_input(x: dict):
    463     return TensorSpec.from_json_dict(**x) if x["type"] == "tensor" else ColSpec(**x)
--> 465 return cls([read_input(x) for x in json.loads(json_str)])
File /databricks/python/lib/python3.10/site-packages/mlflow/types/schema.py:465, in <listcomp>(.0)
    462 def read_input(x: dict):
    463     return TensorSpec.from_json_dict(**x) if x["type"] == "tensor" else ColSpec(**x)
--> 465 return cls([read_input(x) for x in json.loads(json_str)])
File /databricks/python/lib/python3.10/site-packages/mlflow/types/schema.py:463, in Schema.from_json.<locals>.read_input(x)
    462 def read_input(x: dict):
--> 463     return TensorSpec.from_json_dict(**x) if x["type"] == "tensor" else ColSpec(**x)

Solution

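  • The error comes from the format of the dictionary values, not from ModelSignature itself. ModelSignature.from_dict() passes each value to Schema.from_json(), which (as the traceback shows) expects a JSON string that decodes to a list of column specs, each a dict with a "type" key. Your strings decode to a JSON object instead, so iterating over the result yields its keys (plain strings such as "messages"), and x["type"] on a string raises TypeError: string indices must be integers. The simplest fix is to skip the dictionary and build the signature from mlflow.types.Schema objects, which are composed of mlflow.types.ColSpec objects.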
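
The failure can be reproduced with the standard library alone. This sketch decodes the question's "inputs" string and applies the same x["type"] lookup that read_input() performs in the traceback; the variable names are illustrative only.

```python
import json

# The "inputs" value from the question: it decodes to a JSON object,
# not to a list of column specs.
inputs_json = '{"messages": [{"content": "string","role": "string"}]}'

decoded = json.loads(inputs_json)

# Iterating over a dict yields its keys, so every item is a plain string.
items = list(decoded)

try:
    # Same lookup as read_input() in the traceback, applied to the first item.
    items[0]["type"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(type(decoded).__name__, items, error_message)
```

The exact wording of the message varies slightly between Python versions, but it always starts with "string indices must be integers".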

    Here's how you can create the ModelSignature directly instead of going through a dictionary:

    1. Define the input and output schemas using mlflow.types.Schema and mlflow.types.ColSpec.
    2. Create the ModelSignature using these schemas.

    Here’s an example:

    from mlflow.models import ModelSignature
    from mlflow.types import Schema, ColSpec
    
    # Define the input schema
    input_schema = Schema([
        ColSpec("string", "messages.content"),
        ColSpec("string", "messages.role")
    ])
    
    # Define the output schema
    output_schema = Schema([
        ColSpec("string", "question"),
        ColSpec("string", "history.content"),
        ColSpec("string", "history.role"),
        ColSpec("string", "context.metadata.url"),
        ColSpec("double", "context.metadata.id"),
        ColSpec("string", "context.page_content"),
        ColSpec("string", "answer")
    ])
    
    # Create the ModelSignature
    model_signature = ModelSignature(inputs=input_schema, outputs=output_schema)
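
    Because the schemas are built from Schema and ColSpec objects rather than parsed from JSON strings, this never reaches the Schema.from_json() call that raised the TypeError. Note that the dotted names (for example messages.content) are plain column names: this flattens the nested structure into separate columns rather than describing the nesting itself.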
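
If you would rather keep using ModelSignature.from_dict(), each value has to be a JSON string that decodes to a list of column specs. The following is a minimal sketch, not a definitive recipe: it assumes the key names "type" and "name" (mirroring the ColSpec arguments used above; check them against your installed MLflow version) and reuses the same flattened, dotted column names. The mlflow import is guarded so the snippet still runs where MLflow is not installed.

```python
import json

# Column specs in the shape read_input() in the traceback expects:
# a list of dicts, each with a "type" key.
# Assumption: "name" is the matching key for the column name.
input_columns = [
    {"type": "string", "name": "messages.content"},
    {"type": "string", "name": "messages.role"},
]
output_columns = [
    {"type": "string", "name": "question"},
    {"type": "string", "name": "history.content"},
    {"type": "string", "name": "history.role"},
    {"type": "string", "name": "context.metadata.url"},
    {"type": "double", "name": "context.metadata.id"},
    {"type": "string", "name": "context.page_content"},
    {"type": "string", "name": "answer"},
]

# from_dict() wants JSON strings, so serialize each list.
# "params" is left out because the question's value was empty.
signature_dict = {
    "inputs": json.dumps(input_columns),
    "outputs": json.dumps(output_columns),
}

try:
    from mlflow.models import ModelSignature
except ImportError:
    ModelSignature = None  # MLflow not available; only the dict is built.

model_signature = (
    ModelSignature.from_dict(signature_dict) if ModelSignature is not None else None
)
```

Building the lists as Python objects and serializing them with json.dumps() avoids hand-writing JSON strings, which is where the original dictionary went wrong.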