Pydantic - parse a list of objects from YAML configuration file


I want to read a list of objects from a YAML file:

- entry1:
   attribute: "Test1"
   amount: 1
   price: 123.45
- entry2:
   attribute: "Test1"
   amount: 10
   price: 56.78

For this data structure I created three nested models as follows:

# Models
class EntryValues(BaseModel):
    attribute: str
    amount: int
    price: float

class Entry(BaseModel):
    entry1: EntryValues
    entry2: EntryValues

class Config(BaseModel):
    __root__: list[Entry]

My code to read the YAML config file looks as follows:

# get YAML config path
def get_cfg_path() -> Path:
    return CWD

# read YAML file
def read_cfg(file_name: str, file_path: Path = None) -> YAML:
    if not file_path:
        file_path = get_cfg_path()

    if file_path:
        try:
            file = open(file_path / file_name, "r")
        except Exception as e:
            print(f"open file {file_name} failed", e)
            sys.exit(1)
        else:
            return load(file.read())
    else:
        raise Exception(f"Config file {file_name} not found!")

Now I want to unpack the values from the YAML into my model. I tried to do that with the ** operator. I think I'm missing one more loop here, but I cannot get it to work.

# Unpack and create config file
def create_cfg(file_name: str = None) -> Config:
    config_file = read_cfg(file_name=file_name)
    _config = Config(**config_file.data)
    return _config

I would appreciate any help.

Update

So I played around with my model structure a bit without using the YAML file. I don't quite understand why the following throws a ValidationError:

Consider the following list of entries (that's the same data structure I would receive from my YAML file):

entries = [
    {'entry1': {'attribute': 'Test1', 'amount': 1, 'price': 123.45}}, 
    {'entry2': {'attribute': 'Test2', 'amount': 10, 'price': 56.78}}
]

If I run the following simple loop, Pydantic throws a ValidationError:

for entry in entries:
    Entry(**entry)

Error:

ValidationError: 1 validation error for Entry
entry2
  field required (type=value_error.missing)

However, if the list only contains one entry dictionary, then it works:

class Entry(BaseModel):
    entry1: EntryValues
    #entry2: EntryValues

entries = [
    {'entry1': {'attribute': 'Test1', 'amount': 1, 'price': 123.45}}
]

for entry in entries:
    Entry(**entry)

Can someone explain this, or what I'm doing wrong here?


Solution

  • In your update, the second case works but the first does not because the unpacking operator (**) takes a single dictionary, and that dictionary must contain every key the model requires. In the second case, Entry declares only entry1, so the single dictionary supplies everything it needs. In the first case, Entry requires both entry1 and entry2, but they are spread across two dictionaries that are unpacked separately, so the first item fails with the "entry2 field required" error. One possible workaround would be to merge them into a single dictionary. But as far as I understand, a better solution would be to change your YAML to provide a single mapping in the first place, by deleting the first two characters of each line:

    entry1:
     attribute: "Test1"
     amount: 1
     price: 123.45
    entry2:
     attribute: "Test1"
     amount: 10
     price: 56.78
    

    and then, with entries being the dictionary parsed from that YAML:

    _config = Config(__root__=[Entry(**entries)])
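
The merge workaround mentioned above could look roughly like the sketch below. It reuses the models and the entries list from the question and uses plain Python data instead of a YAML file. The names merged and single_item_ok are only illustrative, and this is one possible way to do it, not necessarily the best.

```python
from pydantic import BaseModel, ValidationError


class EntryValues(BaseModel):
    attribute: str
    amount: int
    price: float


class Entry(BaseModel):
    entry1: EntryValues
    entry2: EntryValues


entries = [
    {"entry1": {"attribute": "Test1", "amount": 1, "price": 123.45}},
    {"entry2": {"attribute": "Test2", "amount": 10, "price": 56.78}},
]

# Unpacking one list item at a time fails: each dict carries only one of
# the two keys that Entry requires.
try:
    Entry(**entries[0])
    single_item_ok = True
except ValidationError:
    single_item_ok = False

# Merge the single-key dicts into one dict, then unpack once.
merged = {}
for item in entries:
    merged.update(item)

entry = Entry(**merged)
```

After the merge, entry.entry1 and entry.entry2 are both validated EntryValues instances. Note that this only works as long as the entry names are unique, since a repeated key would silently overwrite the earlier one.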
    

    Original answer:

    There are a number of issues with your code, but I think what you're trying to do is parse the YAML into a dictionary and instantiate an EntryValues from each item. That would look something like this:

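    from pathlib import Path
    from typing import List

    import yaml
    from pydantic import BaseModel


    class EntryValues(BaseModel):
        attribute: str
        amount: int
        price: float


    def create_cfg(file_name: str) -> List[EntryValues]:
        # Assumes the file lives in the current working directory.
        with open(Path.cwd() / file_name) as f:
            # Parse the YAML into a list of single-key dicts.
            entries = yaml.safe_load(f)
        # Pair each list item with its key and validate the inner mapping.
        _config = [
            EntryValues(**di[name]) for di, name in zip(entries, ["entry1", "entry2"])
        ]
        return _config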
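
If the entry names are not known in advance, hardcoding ["entry1", "entry2"] may become awkward. A possible variation, sketched here under the assumption that PyYAML is installed and that every list item is a single-key mapping as in the question, is to iterate over whatever keys are present. load_entries and YAML_TEXT are hypothetical names used only for this illustration.

```python
from typing import Dict, List

import yaml
from pydantic import BaseModel


class EntryValues(BaseModel):
    attribute: str
    amount: int
    price: float


# Inline copy of the YAML from the question, so the sketch runs without a file.
YAML_TEXT = """
- entry1:
    attribute: "Test1"
    amount: 1
    price: 123.45
- entry2:
    attribute: "Test1"
    amount: 10
    price: 56.78
"""


def load_entries(text: str) -> Dict[str, EntryValues]:
    """Hypothetical helper: map each entry name to a validated EntryValues."""
    raw: List[dict] = yaml.safe_load(text)
    result: Dict[str, EntryValues] = {}
    for item in raw:
        # Each item looks like {"entry1": {...}}; validate the inner mapping.
        for name, values in item.items():
            result[name] = EntryValues(**values)
    return result


config = load_entries(YAML_TEXT)
```

Here config["entry1"] and config["entry2"] are EntryValues instances, and adding an entry3 to the YAML would not require any change to the code.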