Python/Pydantic - using a list with json objects


I have a working model to receive a json data set using pydantic. The model data set looks like this:

data = {'thing_number': 123, 
        'thing_description': 'duck',
        'thing_amount': 4.56}

What I would like to do is use a list of JSON objects as the data set and validate all of them. Ultimately the list will be converted to records in pandas for further processing. My goal is to validate an arbitrarily long list of JSON entries that looks something like this:

bigger_data = [{'thing_number': 123, 
                'thing_description': 'duck',
                'thing_amount': 4.56}, 
               {'thing_number': 456, 
                'thing_description': 'cow',
                'thing_amount': 7.89}]

The basic setup I have now is as follows. Note that adding the class ItemList is part of the attempt to get the arbitrary length to work.

from typing import List
from pydantic import BaseModel
from pydantic.schema import schema
import json

class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float

class ItemList(BaseModel):
    each_item: List[Item]                                                                           

The schema this produces looks like what I'm after: an array whose elements are Item objects.

item_schema = schema([ItemList])
print(json.dumps(item_schema, indent=2)) 

    {
      "definitions": {
        "Item": {
          "title": "Item",
          "type": "object",
          "properties": {
            "thing_number": {
              "title": "Thing_Number",
              "type": "integer"
            },
            "thing_description": {
              "title": "Thing_Description",
              "type": "string"
            },
            "thing_amount": {
              "title": "Thing_Amount",
              "type": "number"
            }
          },
          "required": [
            "thing_number",
            "thing_description",
            "thing_amount"
          ]
        },
        "ItemList": {
          "title": "ItemList",
          "type": "object",
          "properties": {
            "each_item": {
              "title": "Each_Item",
              "type": "array",
              "items": {
                "$ref": "#/definitions/Item"
              }
            }
          },
          "required": [
            "each_item"
          ]
        }
      }
    }

The setup works when a single JSON item is passed:

item = Item(**data)                                                      

print(item)

Item thing_number=123 thing_description='duck' thing_amount=4.56

But when I try to pass the single item into the ItemList model, it returns an error:

item_list = ItemList(**data)

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
<ipython-input-94-48efd56e7b6c> in <module>
----> 1 item_list = ItemList(**data)

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()

ValidationError: 1 validation error for ItemList
each_item
  field required (type=value_error.missing)

I've also tried passing bigger_data into the array, thinking that it would need to start as a list. That also returns an error. Although I at least have a better understanding of the dictionary error, I can't figure out how to resolve it.

item_list2 = ItemList(**data_big)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-100-8fe9a5414bd6> in <module>
----> 1 item_list2 = ItemList(**data_big)

TypeError: MetaModel object argument after ** must be a mapping, not list

Thanks.

Other Things I've Tried

I've tried passing the data into the specific key with a little more luck (maybe?).

item_list2 = ItemList(each_item=data_big)

---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
<ipython-input-111-07e5c12bf8b4> in <module>
----> 1 item_list2 = ItemList(each_item=data_big)

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

/opt/conda/lib/python3.7/site-packages/pydantic/main.cpython-37m-x86_64-linux-gnu.so in pydantic.main.validate_model()

ValidationError: 6 validation errors for ItemList
each_item -> 0 -> thing_number
  field required (type=value_error.missing)
each_item -> 0 -> thing_description
  field required (type=value_error.missing)
each_item -> 0 -> thing_amount
  field required (type=value_error.missing)
each_item -> 1 -> thing_number
  field required (type=value_error.missing)
each_item -> 1 -> thing_description
  field required (type=value_error.missing)
each_item -> 1 -> thing_amount
  field required (type=value_error.missing)

Solution

    from typing import List
    from pydantic import BaseModel
    import json
    
    
    class Item(BaseModel):
        thing_number: int
        thing_description: str
        thing_amount: float
    
    
    class ItemList(BaseModel):
        each_item: List[Item]
    

    Based on your code, with each_item as a List of Item:

    a_duck = Item(thing_number=123, thing_description="duck", thing_amount=4.56)
    print(a_duck.json())
    
    a_list = ItemList(each_item=[a_duck])
    
    print(a_list.json())
    

    This generates the following output:

    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
    {"each_item": [{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}]}
    

    Using these as the input JSON:

    a_json_duck = {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
    a_json_list = {
        "each_item": [
            {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56}
        ]
    }
    
    print(Item(**a_json_duck))
    print(ItemList(**a_json_list))
    

    This works just fine and generates:

    Item thing_number=123 thing_description='duck' thing_amount=4.56
    ItemList each_item=[<Item thing_number=123 thing_description='duck' thing_amount=4.56>]
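
The difference between the working and failing calls comes down to what `**` unpacks. A minimal sketch, assuming the same Item and ItemList models as above (the variable names `records`, `wrapped` and `same` are only illustrative): unpacking a bare list fails in Python itself before pydantic validates anything, while a mapping keyed by `each_item` is accepted.

```python
from typing import List

from pydantic import BaseModel


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


class ItemList(BaseModel):
    each_item: List[Item]


records = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]

# ** needs a mapping; a list is rejected by Python at call time,
# so pydantic never gets to validate it.
unpack_error = None
try:
    ItemList(**records)
except TypeError as exc:
    unpack_error = str(exc)

# Wrapping the list under the field name gives ** a mapping to unpack.
wrapped = ItemList(**{"each_item": records})

# Passing the list as a keyword argument is the same call spelled directly.
same = ItemList(each_item=records)
```

Both `wrapped` and `same` hold two validated Item instances; only the spelling of the call differs.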
    

    That leaves the case where we have only the raw data:

    just_datas = [
        {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
        {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
    ]
    item_list = ItemList(each_item=just_datas)
    print(item_list)
    print(type(item_list.each_item[1]))
    print(item_list.each_item[1])
    

    This works as expected:

    ItemList each_item=[<Item thing_number=123 thing_description='duck' thing_amount=4.56>, <Item thin…
    <class '__main__.Item'>
    Item thing_number=456 thing_description='cow' thing_amount=7.89
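
Since every element of each_item is an Item instance, the validated list can be turned back into plain dicts for the pandas step mentioned in the question. This is a sketch rather than the only way to do it: it relies on `dict(model)` copying a model's top-level fields, which is enough here because Item has no nested models, and the name `rows` is an assumption for illustration.

```python
from typing import List

from pydantic import BaseModel


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


class ItemList(BaseModel):
    each_item: List[Item]


records = [
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89},
]

item_list = ItemList(each_item=records)

# One plain dict per validated Item; dict(model) copies top-level fields only.
rows = [dict(item) for item in item_list.each_item]

# rows is a list of flat dicts, the shape pandas.DataFrame(rows) accepts
# (pandas is not imported or run in this sketch).
```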
    

    So unless I'm missing something, the pydantic library works as expected.

    My pydantic version: 0.30, Python 3.7.4

    Reading from a file-like object:

    json_data_file = """[
    {"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
    {"thing_number": 456, "thing_description": "cow", "thing_amount": 7.89}]"""
    
    from io import StringIO
    item_list2 = ItemList(each_item=json.load(StringIO(json_data_file)))
    

    This also works fine.
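
If some entries in a long list may be malformed, validating the whole list through ItemList rejects everything at once. One possible alternative, sketched here with made-up bad records and the hypothetical names `valid_items` and `rejected`, is to validate each entry on its own and keep the errors next to the index they came from.

```python
import json

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    thing_number: int
    thing_description: str
    thing_amount: float


# Illustrative input: the second and third records are deliberately invalid.
raw = """[
{"thing_number": 123, "thing_description": "duck", "thing_amount": 4.56},
{"thing_number": "not a number", "thing_description": "cow", "thing_amount": 7.89},
{"thing_description": "goose", "thing_amount": 1.23}
]"""

valid_items = []
rejected = []  # (index in the input list, list of error dicts)

for index, record in enumerate(json.loads(raw)):
    try:
        valid_items.append(Item(**record))
    except ValidationError as exc:
        # exc.errors() returns one dict per failing field, including its "loc".
        rejected.append((index, exc.errors()))
```

After the loop, `valid_items` holds only the records that passed, and `rejected` says which positions failed and on which field.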