Distinguishing between Pydantic models with the same fields


I'm using Pydantic to define hierarchical data in which there are models with identical attributes.

However, when I save and load these models, Pydantic can no longer distinguish which model was used and picks the first one in the field type annotation.

I understand that this is expected behavior based on the documentation. However, the class type information is important to my application.

What is the recommended way to distinguish between different classes in Pydantic? One hack is to simply add an extraneous field to one of the models, but I'd like to find a more elegant solution.

See the simplified example below: container is initialized with data of type DataB, but after exporting and loading, the new container has data of type DataA as it's the first element in the type declaration of container.data.

Thanks for your help!

from abc import ABC
from pydantic import BaseModel  # pydantic 1.8.2
from typing import Union

class Data(BaseModel, ABC):
    """ base class for a Member """
    number: float

class DataA(Data):
    """ A type of Data"""
    pass

class DataB(Data):
    """ Another type of Data """
    pass

class Container(BaseModel):
    """ container holds a subclass of Data """
    data: Union[DataA, DataB]

# initialize container with DataB
data = DataB(number=1.0)
container = Container(data=data)

# export container to string and load new container from string
string = container.json()
new_container = Container.parse_raw(string)

# look at type of new_container.data
print(type(new_container.data).__name__)
# >>> DataA

Solution

  • As correctly noted in the comments, without storing additional information models cannot be distinguished when parsing.

    As of today (pydantic v1.8.2), the most canonical way to resolve an ambiguous Union when parsing is to give each model an explicit tag field annotated with Literal, so that each model accepts only its own tag value. It looks like this:

    from abc import ABC
    from pydantic import BaseModel
    from typing import Union, Literal
    
    class Data(BaseModel, ABC):
        """ base class for a Member """
        number: float
    
    
    class DataA(Data):
        """ A type of Data"""
        tag: Literal['A'] = 'A'
    
    
    class DataB(Data):
        """ Another type of Data """
        tag: Literal['B'] = 'B'
    
    
    class Container(BaseModel):
        """ container holds a subclass of Data """
        data: Union[DataA, DataB]
    
    
    # initialize container with DataB
    data = DataB(number=1.0)
    container = Container(data=data)
    
    # export container to string and load new container from string
    string = container.json()
    new_container = Container.parse_raw(string)
    
    
    # look at type of new_container.data
    print(type(new_container.data).__name__)
    # >>> DataB
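
To see why the tag makes the difference, here is a minimal plain-Python sketch of first-match union resolution. It does not use pydantic and is not pydantic's actual implementation; `parse_union` and the `Plain*`/`Tagged*` classes are hypothetical names chosen for illustration. Each candidate is tried left to right, and the first one that accepts the payload wins.

```python
from dataclasses import dataclass


@dataclass
class PlainA:
    number: float


@dataclass
class PlainB:
    number: float


@dataclass
class TaggedA:
    number: float
    tag: str = "A"

    def __post_init__(self):
        # Emulate Literal['A']: reject any other tag value.
        if self.tag != "A":
            raise ValueError(f"unexpected tag {self.tag!r}")


@dataclass
class TaggedB:
    number: float
    tag: str = "B"

    def __post_init__(self):
        # Emulate Literal['B']: reject any other tag value.
        if self.tag != "B":
            raise ValueError(f"unexpected tag {self.tag!r}")


def parse_union(payload, candidates):
    """Try each candidate class in order; return the first that accepts the payload."""
    for cls in candidates:
        try:
            return cls(**payload)
        except (TypeError, ValueError):
            continue
    raise ValueError("no candidate matched the payload")


# Identical fields: the first candidate always matches, so the type is lost.
untagged = parse_union({"number": 1.0}, [PlainA, PlainB])
# With a tag, the first candidate rejects the payload and the second one is chosen.
tagged = parse_union({"number": 1.0, "tag": "B"}, [TaggedA, TaggedB])

print(type(untagged).__name__)  # PlainA
print(type(tagged).__name__)    # TaggedB
```

The tag costs one extra key in the serialized data, but it is the only thing in the payload that tells the two models apart.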
    

    This method can be automated, but use it at your own risk: it breaks static typing and relies on pydantic internals (ModelField, __fields__) that may change in future versions:

    from abc import ABC
    from typing import Literal
    
    from pydantic import BaseModel
    from pydantic.fields import ModelField
    
    class Data(BaseModel, ABC):
        """ base class for a Member """
        number: float
    
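        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            # use the subclass name as the tag value, e.g. 'DataA'
            name = 'tag'
            value = cls.__name__
            annotation = Literal[value]
    
            # build the field by hand and register it on the subclass
            tag_field = ModelField.infer(
                name=name,
                value=value,
                annotation=annotation,
                class_validators=None,
                config=cls.__config__,
            )
            cls.__fields__[name] = tag_field
            cls.__annotations__[name] = annotation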
    
    
    class DataA(Data):
        """ A type of Data"""
        pass
    
    
    class DataB(Data):
        """ Another type of Data """
        pass
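
The hook that drives the automated version is `__init_subclass__`, which Python calls once for every newly defined subclass. Below is a minimal plain-Python sketch of just that mechanism, without pydantic. The names `Tagged`, `Circle` and `Square` are made up for illustration, and the tag here is an ordinary class attribute rather than a pydantic field, so this shows the idea only, not the answer's exact behavior.

```python
class Tagged:
    """Base class that stamps each subclass with a tag derived from its name."""

    tag = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once per subclass, at class-definition time.
        cls.tag = cls.__name__


class Circle(Tagged):
    pass


class Square(Tagged):
    pass


print(Circle.tag, Square.tag)  # Circle Square
print(Tagged.tag)              # None
```

Because the tag is derived from the class name, renaming a class changes the tag, which would make previously saved data unreadable under a scheme like this.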