Tags: python, python-3.6, pydantic

Get all required fields of a nested Pydantic model


My nested Pydantic model is defined as follows:

from typing import Optional
from pydantic import BaseModel


class Location(BaseModel):
    city: Optional[str]
    state: str
    country: str


class User(BaseModel):
    id: int
    name: str = "Gandalf"
    age: Optional[int]
    location: Location

I would like to get all required fields for the User model. For the above example, the expected output is ["id", "state", "country"].

Any help greatly appreciated.


Solution

  • Here is a solution with a generator function.


    UPDATE: Pydantic v2

    from collections.abc import Iterator
    from pydantic import BaseModel
    
    
    def required_fields(model: type[BaseModel], recursive: bool = False) -> Iterator[str]:
        for name, field in model.model_fields.items():
            if not field.is_required():
                continue
            t = field.annotation
            if recursive and isinstance(t, type) and issubclass(t, BaseModel):
                yield from required_fields(t, recursive=True)
            else:
                yield name
    

    Important note:

    Starting with version 2.0, Pydantic (thankfully) made its treatment of fields of type T | None (or Optional[T]) consistent with all other types. This has backwards-incompatible implications, particularly regarding which fields are considered required.

    This change is documented in the Pydantic migration guide, but the short version is that a field is now considered required if and only if no default value was explicitly defined for it. (Assigning the Ellipsis/... to it in a model definition still makes it required, but does not constitute setting a default.) This means a field typed Optional[T] no longer has an implicit default value of None.

    Using your example above, those models have more required fields in Pydantic v2 than in Pydantic v1. In fact, the only non-required field is User.name.
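
To make the rule concrete, here is a minimal sketch, assuming Pydantic v2 is installed. The Example model and its field names are made up for illustration; model_fields and is_required() are the same attributes the generator above relies on.

```python
from typing import Optional

from pydantic import BaseModel


class Example(BaseModel):
    # Hypothetical model, only here to illustrate the v2 rule.
    a: Optional[int]          # no default: required, even though None is a valid value
    b: Optional[int] = None   # explicit default: not required
    c: int = ...              # Ellipsis marks the field as required, it is not a default


# Map each field name to whether Pydantic treats it as required.
required = {name: field.is_required() for name, field in Example.model_fields.items()}
print(required)
```

Under these assumptions, only b is reported as not required.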

    Demo:

    print(list(required_fields(User, recursive=True)))
    

    Output:

    ['id', 'age', 'city', 'state', 'country']
    

    If you wanted to migrate to v2 and keep the v1 behavior for those models, you would have to specify the None default explicitly for the relevant fields, like this:

    from pydantic import BaseModel
    from typing import Optional
    
    
    class Location(BaseModel):
        city: Optional[str] = None
        state: str
        country: str
    
    
    class User(BaseModel):
        id: int
        name: str = "Gandalf"
        age: Optional[int] = None
        location: Location
    

    Using the same generator will now yield the expected output:

    ['id', 'state', 'country']
    

    Original: Pydantic v1

    from collections.abc import Iterator
    from pydantic import BaseModel
    
    
    def required_fields(model: type[BaseModel], recursive: bool = False) -> Iterator[str]:
        for name, field in model.__fields__.items():
            if not field.required:
                continue
            t = field.type_
            if recursive and isinstance(t, type) and issubclass(t, BaseModel):
                yield from required_fields(t, recursive=True)
            else:
                yield name
    

    Using the models you defined in your example, we can demonstrate it like this:

    print(list(required_fields(User, recursive=True)))
    

    Output:

    ['id', 'state', 'country']
    

    UPDATE: Caveat

    Regardless of the Pydantic version, this recursion does not work if the type of a field is a union of a model and another type.

    In fact, the expected behavior at that point is not well defined. Say your User model instead looked like this:

    from pydantic import BaseModel
    from typing import Union
    
    
    class Location(BaseModel):
        ...
    
    
    class User(BaseModel):
        ...
        location: Union[Location, int]
    

    Which fields are now required exactly? Technically, you can initialize a User now by just passing an integer for location (e.g. User(..., location=1)), which would mean none of the Location model fields would be required. But if you wanted to assign a Location instance to it, then its required fields would be required for a User as well.
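
A short sketch of that ambiguity, assuming Pydantic v2. The model bodies here are filled in with a couple of fields from the question purely for illustration; they are not meant to reproduce the elided definitions above.

```python
from typing import Union

from pydantic import BaseModel, ValidationError


class Location(BaseModel):
    state: str
    country: str


class User(BaseModel):
    id: int
    location: Union[Location, int]


# Passing an integer satisfies the union without any Location fields.
by_number = User(id=1, location=1)

# Passing location data makes the required Location fields mandatory.
by_model = User(id=2, location={"state": "NY", "country": "USA"})

# Incomplete location data matches neither member of the union.
try:
    User(id=3, location={"state": "NY"})
    incomplete_rejected = False
except ValidationError:
    incomplete_rejected = True

print(type(by_number.location).__name__, type(by_model.location).__name__, incomplete_rejected)
```

So whether state and country count as required depends on which member of the union the input ends up matching.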

    This may seem like a contrived example, but it is pretty common to have for example unions of different models as field types, which causes the same problem. Example:

    from pydantic import BaseModel
    from typing import Union
    
    
    class LocationA(BaseModel):
        ...
        state: str
        country: str
    
    
    class LocationB(BaseModel):
        ...
        latitude: str
        longitude: str
    
    
    class User(BaseModel):
        ...
        location: Union[LocationA, LocationB]
    

    Again, it is unclear which sub-fields exactly should be considered required in the sense of your question. The technical answer given by Pydantic is, of course, the fields required by whichever sub-model the input is validated against. But it is unclear how this can be properly expressed in a simple list of field names.

    My solution above therefore does not look for nested fields when it encounters a union. It simply yields the name of the field itself (location in these examples), unless a default is set.
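
If a flat list of names is not a hard requirement, one possible way to express the union case is to yield dotted paths and label each alternative with its model name. The following is only a sketch under that assumption, for Pydantic v2: required_paths, _is_model and the path format are hypothetical and not part of Pydantic or of the solution above.

```python
import types
from collections.abc import Iterator
from typing import Union, get_args, get_origin

from pydantic import BaseModel

# typing.Union covers Union[A, B]; types.UnionType covers A | B on Python 3.10+.
_UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def _is_model(t: object) -> bool:
    # get_origin() is None for plain classes, which filters out aliases like list[int].
    return get_origin(t) is None and isinstance(t, type) and issubclass(t, BaseModel)


def required_paths(model: type[BaseModel], prefix: str = "") -> Iterator[str]:
    """Hypothetical variant of required_fields that yields dotted paths."""
    for name, field in model.model_fields.items():
        if not field.is_required():
            continue
        t = field.annotation
        if _is_model(t):
            # Plain nested model: descend and prefix the sub-field names.
            yield from required_paths(t, f"{prefix}{name}.")
        elif get_origin(t) in _UNION_ORIGINS:
            # The field itself is required; sub-fields depend on the chosen member.
            yield f"{prefix}{name}"
            for member in get_args(t):
                if _is_model(member):
                    yield from required_paths(member, f"{prefix}{name}[{member.__name__}].")
        else:
            yield f"{prefix}{name}"


class LocationA(BaseModel):
    state: str
    country: str


class LocationB(BaseModel):
    latitude: str
    longitude: str


class User(BaseModel):
    id: int
    location: Union[LocationA, LocationB]


paths = list(required_paths(User))
print(paths)
```

For these illustrative models, the result lists id and location, followed by the sub-fields of each alternative, such as location[LocationA].state and location[LocationB].latitude. Whether that format is useful depends entirely on what the list is needed for.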