python, python-typing, pydantic

pydantic: parse list of models with relationship


I have these models:

class Mailbox:
    def get_mails(self) -> List[Mail]:
        pass


class Mail(BaseModel):
    id: int
    attachments: List[Attachment]

    mailbox: Mailbox


class Attachment(BaseModel):
    filename: str
    size: int

    mail: Mail

And I have this list of dicts (mails):

[
    {
        "id": 1,
        "attachments": [
            {"filename": "some.txt", "size": 128},
            {"filename": "some2.txt", "size": 256}
        ]
    },
    {
        "id": 2,
        "attachments": [
            {"filename": "some.txt", "size": 64}
        ]
    }
]

How can I build a list of Mail objects from this list of dicts?

I tried to do this, but ran into two problems:

  1. the parse_obj_as function (for such parsing) is deprecated; the documentation says to use TypeAdapter, but I could not figure out how

  2. I am unable to do this due to the relationships of the models

Full code with models: https://pastebin.com/qdmNAk0S


Solution

  • The main challenge here is the circular dependency between Mail and Attachment. No validator alone can solve this because they are all stateless/class methods with no access to the model instance being initialized.

    You can hack around that by attaching a WrapValidator to the Attachment items in the attachments list of Mail. If the validator receives some sort of mapping, it simply returns a dict copy of it without parsing or validating; anything else is passed on to the normal handler.

    Then you can use Mail.model_post_init to pick those dictionaries up, set their mail key to the fully initialized Mail instance and parse them into real Attachment objects.

    You can solve the annotation issue by using a forward reference.

    Lastly, TypeAdapter is very simple to use. Just pass it the type you want to validate against. The resulting object has a validate_python method to parse whatever data fits the specified schema.
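
In isolation, replacing parse_obj_as with TypeAdapter might look like the minimal sketch below. The Item model and its fields are hypothetical and only serve to illustrate the call; they are not part of the question.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter


class Item(BaseModel):  # hypothetical model, for illustration only
    name: str
    qty: int


# Build the adapter once and reuse it; it validates against List[Item].
ItemList = TypeAdapter(List[Item])

raw = [{"name": "bolt", "qty": 2}, {"name": "nut", "qty": 5}]
items = ItemList.validate_python(raw)  # a list of Item instances
```

The adapter for List[Mail] in the full solution is built and used the same way.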

    Here is a working example:

    from collections.abc import Mapping
    from typing import Any, Callable, List
    from typing_extensions import Annotated
    
    from pydantic import BaseModel, TypeAdapter, WrapValidator
    
    
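    def allow_unvalidated_dict(v: Any, handler: Callable[[Any], Any]) -> Any:
        # Let mappings through as plain dicts; Mail.model_post_init parses them later.
        if isinstance(v, Mapping):
            return dict(v)
        # Anything else (e.g. an Attachment instance) gets normal validation.
        return handler(v)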
    
    
    class Mail(BaseModel):
        id: int
        attachments: List[
            Annotated["Attachment", WrapValidator(allow_unvalidated_dict)]
        ]
    
        def model_post_init(self, __context: object) -> None:
            # Runs after Mail is initialized, so `self` can serve as the back-reference.
            for idx, attachment in enumerate(self.attachments):
                if isinstance(attachment, dict) and attachment.get("mail") is None:
                    attachment["mail"] = self
                    self.attachments[idx] = Attachment.model_validate(attachment)
    
    
    class Attachment(BaseModel):
        filename: str
        size: int
        mail: Mail
    
    
    MailList = TypeAdapter(List[Mail])
    

    Demo:

    test_data = [
        {
            "id": 1,
            "attachments": [
                {"filename": "some.txt", "size": 128},
                {"filename": "some2.txt", "size": 256}
            ]
        },
        {
            "id": 2,
            "attachments": [
                {"filename": "some.txt", "size": 64}
            ]
        }
    ]
    
    validated = MailList.validate_python(test_data)
    for mail in validated:
        print(mail)
    

    Output:

    id=1 attachments=[Attachment(filename='some.txt', size=128, mail=Mail(id=1, attachments=[...])), Attachment(filename='some2.txt', size=256, mail=Mail(id=1, attachments=[...]))]
    id=2 attachments=[Attachment(filename='some.txt', size=64, mail=Mail(id=2, attachments=[...]))]
    

    Of course you could completely omit the WrapValidator, if you just allowed None for Attachment.mail and set it as a default. But I understand that this may be undesirable.
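
A rough sketch of that simpler variant, assuming it is acceptable for Attachment.mail to be None until the owning Mail fills it in. The explicit Mail.model_rebuild() call is a precaution to resolve the forward reference and may not be strictly required.

```python
from typing import Any, List, Optional

from pydantic import BaseModel, TypeAdapter


class Mail(BaseModel):
    id: int
    attachments: List["Attachment"]

    def model_post_init(self, __context: Any) -> None:
        # Attachments are already validated here; only the back-reference is missing.
        for attachment in self.attachments:
            if attachment.mail is None:
                attachment.mail = self


class Attachment(BaseModel):
    filename: str
    size: int
    # Assumption: None is an acceptable placeholder until Mail sets it.
    mail: Optional[Mail] = None


# Resolve the "Attachment" forward reference now that both classes exist.
Mail.model_rebuild()

mails = TypeAdapter(List[Mail]).validate_python(
    [{"id": 1, "attachments": [{"filename": "some.txt", "size": 128}]}]
)
# Identity check rather than ==, because the two models reference each other.
print(mails[0].attachments[0].mail is mails[0])
```

The trade-off is that an Attachment created on its own can now exist without a mail, so code that reads attachment.mail has to account for None.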