jsonschema pydantic python-jsonschema

pydantic model with fields that have dependent allowable values


I have a pydantic model that has fields with dependent allowable values which I would like to be properly represented in the JSON Schema. Think of this as a category / subcategory relationship like:

category  | subcategory
-----------------------
fruit     | apple
fruit     | orange
vegetable | cucumber
vegetable | tomato

My pydantic model looks something like:

import pydantic
from typing import Literal

class Item(pydantic.BaseModel):
    category: Literal["fruit", "vegetable"]
    subcategory: Literal["apple", "orange", "cucumber", "tomato"]

This correctly restricts subcategory to the four known values, so "puzzles" or any other unexpected subcategory is rejected. However, it does not express the dependency between each category and its allowable subcategories in the JSON Schema. Is it possible to specify this kind of relationship in a JSON Schema? Presuming it is possible, what's an appropriate way to accomplish this with pydantic?

I am aware of pydantic validators, but they only check the incoming data when the model is instantiated. In my use case, I want users to be able to select a valid category and subcategory from a UI, so the UI needs to know about the dependency between these fields, ideally through the JSON Schema.


Solution

  • You can achieve this by creating one model per category, each of which pins its category and the allowed subcategories using Literals.

    Let's create the models:

    from pydantic import BaseModel
    from typing import Literal
    
    
    class Item(BaseModel):
        category: str
        subcategory: str
    
    
    class Fruit(Item):
        category: Literal['fruit']
        subcategory: Literal['apple', 'orange']
    
    
    class Vegetable(Item):
        category: Literal['vegetable']
        subcategory: Literal['cucumber', 'tomato']
    
    
    class Foo(BaseModel):
        item: Fruit | Vegetable
    

    The key here is a model like Foo whose field type is a union of all possible item types. Pydantic will then try to find a suitable match among the union members. Because each member pins both category and subcategory with Literals, only valid category/subcategory combinations pass validation.

    Here is a test:

    Foo(item={'category': 'fruit', 'subcategory': 'apple'})
    # Foo(item=Fruit(category='fruit', subcategory='apple'))
    
    Foo(item={'category': 'fruit', 'subcategory': 'tomato'})
    # ValidationError: 2 validation errors for Foo
    # item.Fruit.subcategory
    #   Input should be 'apple' or 'orange' [type=literal_error, input_value='tomato', input_type=str]
    #     For further information visit https://errors.pydantic.dev/2.6/v/literal_error
    # item.Vegetable.category
    #   Input should be 'vegetable' [type=literal_error, input_value='fruit', input_type=str]
    #     For further information visit https://errors.pydantic.dev/2.6/v/literal_error
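
    Since the question is about the JSON Schema rather than runtime validation, it may help to look at what the union produces. The sketch below assumes pydantic v2 (the error URLs above point at 2.6) and uses its model_json_schema() method. TaggedFoo is a hypothetical name, added only to show the optional discriminated-union variant via Field(discriminator=...). Whether a given UI generator understands anyOf or oneOf is a separate question that depends on the tool.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field


class Item(BaseModel):
    category: str
    subcategory: str


class Fruit(Item):
    category: Literal["fruit"]
    subcategory: Literal["apple", "orange"]


class Vegetable(Item):
    category: Literal["vegetable"]
    subcategory: Literal["cucumber", "tomato"]


class Foo(BaseModel):
    # Plain union: emitted as "anyOf" in the JSON Schema.
    item: Union[Fruit, Vegetable]


class TaggedFoo(BaseModel):
    # Hypothetical variant: a discriminated union keyed on "category",
    # emitted as "oneOf" plus a "discriminator" entry.
    item: Union[Fruit, Vegetable] = Field(discriminator="category")


schema = Foo.model_json_schema()
tagged_schema = TaggedFoo.model_json_schema()

# Each branch of the union is a reference into "$defs".
print(schema["properties"]["item"]["anyOf"])

# Each referenced model lists only its own allowed subcategories.
print(schema["$defs"]["Fruit"]["properties"]["subcategory"]["enum"])
print(schema["$defs"]["Vegetable"]["properties"]["subcategory"]["enum"])

# The discriminated variant names the property that selects the branch.
print(tagged_schema["properties"]["item"]["discriminator"])
```

    In both variants the dependency lives in the schema itself: each branch under $defs fixes one category and lists only its subcategories, so a client reading the schema can derive the allowed pairs without calling the validator.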