Tags: python, python-typing, pydantic

Not Required in Pydantic's Base Models


I'm trying to accept data from an API and then validate the response structure with a Pydantic base model. However, some fields are included in the response only some of the time. When I try to validate the structure, Pydantic complains about those fields being "missing", even though they are allowed to be absent. I don't understand how to define a field as one that may be missing. The docs mention that a field defined with just a name and a type is treated this way, but I haven't had any luck.

This is a simple example of what I'm trying to accomplish:

import json
from typing import List

from pydantic import BaseModel

# Example API response body (normally this comes from the HTTP request)
request_response = '{"a": 1, "b": "abc", "c": ["a", "b", "c"]}'
response: dict = json.loads(request_response)

# Pydantic base model
class Model(BaseModel):
    a: int
    b: str
    c: List[str]
    d: float

# Validating
Model(**response)

# Result: ValidationError, field "d" is missing

How do I make it so that "d" doesn't cause the validation to throw an error? I have tried changing the field to d: Optional[float] and to d: Optional[float] = 0.0, but nothing works.

Thanks!


Solution

  • Pydantic v2

    Either a model has a field or it does not. In a sense, a field is always required to have a value on a fully initialized model instance. It is just that a field may have a default value that will be assigned to it, if no value was explicitly provided during initialization. (see Basic Model Usage in the docs)

    The question for you is ultimately: What value should be assigned to field d if it is not set explicitly? Should it be None, some default float value (e.g. 0.), or something else? Whatever you choose, you must specify that default value in the model definition and remember to annotate the field with the correct type.

    If you choose a default float such as 0., the type remains the same and the field definition becomes d: float = 0.0 with nothing else changed. If you want the default to be of a different type, such as None, you will need to change the definition to d: float | None = None.
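
    If you later need to tell "the key was absent" apart from "the key was present but null", Pydantic v2 records which fields were explicitly provided. The following is a minimal sketch, assuming Pydantic v2 is installed; the trimmed-down Model and the variable names are for illustration only:

```python
from typing import Optional

from pydantic import BaseModel


class Model(BaseModel):
    a: int
    d: Optional[float] = None  # same as `float | None = None` on Python 3.10+


# "d" is absent: the default is used and "d" is not recorded as set
missing = Model.model_validate({"a": 1})
# "d" is present but null: same value, but recorded as explicitly set
explicit = Model.model_validate({"a": 1, "d": None})

print(missing.d, "d" in missing.model_fields_set)
print(explicit.d, "d" in explicit.model_fields_set)

# exclude_unset drops the fields that fell back to their default
print(missing.model_dump(exclude_unset=True))
print(explicit.model_dump(exclude_unset=True))
```

    Both instances hold d=None; only model_fields_set and exclude_unset reveal whether the payload actually contained the key.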

    For the sake of completeness, you may also define a default factory instead of a static value to have the actual value be calculated during initialization.

    Here is a short demo:

    from pydantic import BaseModel, Field, ValidationError
    
    
    class Model(BaseModel):
        a: int
        b: str
        c: list[str]
        d: float | None = None  # equivalent: `d: typing.Optional[float] = None`
        e: float = 0.
        f: float = Field(default_factory=lambda: 420.69)
    
    
    if __name__ == '__main__':
        instance = Model.model_validate({
            "a": 1,
            "b": "abc",
            "c": ["a", "b", "c"],
        })
        print(instance.model_dump_json(indent=4))
    
        try:
            Model.model_validate({
                "a": 1,
                "b": "abc",
                "c": ["a", "b", "c"],
                "d": None,  # fine
                "e": None,  # error
                "f": None,  # error
            })
        except ValidationError as e:
            print(e.json(indent=4))
    

    Output:

    {
        "a": 1,
        "b": "abc",
        "c": [
            "a",
            "b",
            "c"
        ],
        "d": null,
        "e": 0.0,
        "f": 420.69
    }
    
    [
        {
            "type": "float_type",
            "loc": [
                "e"
            ],
            "msg": "Input should be a valid number",
            "input": null,
            "url": "https://errors.pydantic.dev/2.7/v/float_type"
        },
        {
            "type": "float_type",
            "loc": [
                "f"
            ],
            "msg": "Input should be a valid number",
            "input": null,
            "url": "https://errors.pydantic.dev/2.7/v/float_type"
        }
    ]
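
    The default factory mentioned above is most useful when the default has to be computed for each instance rather than fixed once. A minimal sketch, assuming Pydantic v2; the Event model and its fields are hypothetical and not part of the question:

```python
from datetime import datetime, timezone
from uuid import uuid4

from pydantic import BaseModel, Field


class Event(BaseModel):  # hypothetical model, for illustration only
    name: str
    # The factory is called once per instance, when no value is provided
    event_id: str = Field(default_factory=lambda: uuid4().hex)
    received_at: datetime = Field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


first = Event(name="first")
second = Event(name="second")
# An explicitly provided value takes precedence over the factory
fixed = Event(name="fixed", event_id="abc123")

print(first.event_id != second.event_id)
print(fixed.event_id)
```

    A static default such as e: float = 0. would be shared as the same value by every instance, whereas the factory runs again on each initialization.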
    

  • [Old answer] Pydantic v1

    As @python_user said, both your suggestions work.

    Admittedly, the behavior of typing.Optional for Pydantic fields is poorly documented. Perhaps because it is assumed to be obvious. I personally don't find it obvious because Optional[T] is just equivalent to Union[T, None] (or T | None in the new notation).

    Annotating a field with any other union of types while omitting a default value will result in the field being required. But if you annotate with a union that includes None, the field automatically receives None as its default value. Kind of inconsistent, but that is the way it is.


    Update (2024-04-20): As @Michael mentioned in the comments, with the release of Pydantic v2, the maintainers addressed this exact inconsistency (and arguably "fixed" it). The change is explained in the documentation section on Required Fields. Quote: (emphasis mine)

    In Pydantic V1, fields annotated with Optional or Any would be given an implicit default of None even if no default was explicitly specified. This behavior has changed in Pydantic V2, and there are no longer any type annotations that will result in a field having an implicit default value.

    See my Pydantic v2 answer above for an example.
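
    The v2 demo does not cover a nullable field that has no default. A minimal sketch of that case, assuming Pydantic v2 is installed (the trimmed-down Model here is for illustration only):

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Model(BaseModel):
    a: int
    d: Optional[float]  # nullable, but still required in v2 (no default given)


# An explicit None is accepted ...
nullable = Model.model_validate({"a": 1, "d": None})
print(nullable)

# ... but leaving the key out raises a "missing" error in v2
try:
    Model.model_validate({"a": 1})
except ValidationError as exc:
    errors = exc.errors()
    print(errors[0]["type"], errors[0]["loc"])
```

    Under Pydantic v1 the same model would have accepted the second payload and set d to None, which is the implicit default the quote refers to.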


    However, the question is ultimately what you want your model fields to be. What value should be assigned to field d if it is not set explicitly? Should it be None or some default float value (e.g. 0.)? If None is fine, then in Pydantic v1 you can use Optional[float] or float | None, and you don't need to specify the default value. (Writing d: Optional[float] = None is equivalent.) If you want any other default value, you'll need to specify it accordingly, e.g. d: float = 0.0; but in that case None would be an invalid value for that field.

    from pydantic import BaseModel, ValidationError
    
    
    class Model(BaseModel):
        a: int
        b: str
        c: list[str]
        d: float | None  # or typing.Optional[float]
        e: float = 0.
    
    
    if __name__ == '__main__':
        print(Model.parse_obj({
            "a": 1,
            "b": "abc",
            "c": ["a", "b", "c"],
        }), "\n")
        try:
            Model.parse_obj({
                "a": 1,
                "b": "abc",
                "c": ["a", "b", "c"],
                "e": None,
            })
        except ValidationError as e:
            print(e)
    

    Output:

    a=1 b='abc' c=['a', 'b', 'c'] d=None e=0.0 
    
    1 validation error for Model
    e
      none is not an allowed value (type=type_error.none.not_allowed)