
Incorrect Serialization and Deserialization of Union Types with dataclasses-avroschema


I ran into an issue while using the dataclasses-avroschema package for Avro serialization in Python. When I serialize and then deserialize a dataclass whose field is a union of two record types, the deserialized object comes back as the wrong member of the union.

from dataclasses_avroschema import AvroModel
from dataclasses import dataclass
import typing

@dataclass
class MessageTypeTwo(AvroModel):
    val: typing.Union[None, str]
    class Meta:
        namespace = "Messages.type.two"

@dataclass
class MessageTypeOne(AvroModel):
    class Meta:
        namespace = "Messages.type.one"

@dataclass
class CoreMessage(AvroModel):
    messageBody: typing.Union[
        MessageTypeOne,
        MessageTypeTwo,
    ]

Serialize and deserialize an instance of CoreMessage with an instance of MessageTypeTwo:

mt2 = MessageTypeTwo(val="val")
core_message = CoreMessage(messageBody=mt2)
serialized = core_message.serialize()
deserialized = CoreMessage.deserialize(serialized)
print(deserialized.messageBody)

Expected Result: The print statement should output MessageTypeTwo(val='val').

Actual Result: The print statement outputs MessageTypeOne().


Solution

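  • To resolve the deserialization mismatch, set the dacite_config attribute in the Meta class of CoreMessage and enable "strict": True. In strict mode dacite rejects input containing keys that the target dataclass does not declare, so the field-less MessageTypeOne should no longer match a payload that carries val: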

    @dataclass
    class CoreMessage(AvroModel):
        messageBody: typing.Union[
            MessageTypeOne,
            MessageTypeTwo,
        ]
        class Meta:
            dacite_config = {
                "strict": True,
            }
    

    For further details, see the dacite documentation on its config options.
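
To illustrate why strict matching changes the result, here is a minimal standard-library sketch. It does not use dataclasses-avroschema or dacite; it assumes the union is resolved by trying each member in declaration order and keeping the first one that can be built from the decoded dict, which is my understanding of how dacite behaves. The helpers build and resolve_union are hypothetical names made up for this example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MessageTypeOne:
    pass  # no fields, so any dict can populate it if extra keys are ignored


@dataclass
class MessageTypeTwo:
    val: Optional[str]


def build(cls, data, strict):
    """Hypothetical helper: build cls from a dict.

    In strict mode, keys that cls does not declare are an error;
    otherwise they are silently dropped.
    """
    names = {f.name for f in fields(cls)}
    if strict and set(data) - names:
        raise ValueError(f"unexpected keys for {cls.__name__}")
    return cls(**{k: v for k, v in data.items() if k in names})


def resolve_union(members, data, strict):
    """Hypothetical helper: return the first union member that builds."""
    for cls in members:
        try:
            return build(cls, data, strict)
        except (TypeError, ValueError):
            continue  # this member does not fit, try the next one
    raise ValueError("no union member matched")


union = (MessageTypeOne, MessageTypeTwo)
payload = {"val": "val"}  # the decoded record as a plain dict

loose_result = resolve_union(union, payload, strict=False)
strict_result = resolve_union(union, payload, strict=True)

print(loose_result)   # MessageTypeOne()
print(strict_result)  # MessageTypeTwo(val='val')
```

Under this assumption, the non-strict pass stops at MessageTypeOne because dropping the unknown val key leaves nothing that conflicts with an empty dataclass. The strict pass rejects MessageTypeOne and falls through to MessageTypeTwo, which matches the behaviour reported with "strict": True.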