How to check a YAML schema at multiple levels using Python


If the YAML file is valid, yaml.safe_load() converts it to a dictionary (in the code below, the file content is represented by the variable CONFIG and the resulting dictionary by configuration).

After that first validation, the schema should check the value of each key to see whether its type matches.

from schema import Use, Schema, SchemaError
import yaml

config_schema = Schema(
    {
        "pipeline": Use(
            str,
            error="Unsupported pipeline name. A string input is expected"
        ),
        "retry_parameters": Use(
            int,
            error="Unsupported retry strategy. An integer input is expected"
        )
    },
    error="A yaml file is expected"
)

CONFIG = """
pipeline: 1
retry_parameters: 'pipe_1'
"""

configuration = yaml.safe_load(CONFIG)

try:
    config_schema.validate(configuration)
    print("Configuration is valid.")
except SchemaError as se:
    for error in se.errors:
        if error:
            print(error)

The example above raises and prints three errors:

A yaml file is expected
A yaml file is expected
Unsupported retry strategy. An integer input is expected

But in that case I was expecting the following:

Unsupported pipeline name. A string input is expected
Unsupported retry strategy. An integer input is expected

How can I check if the file has a valid yaml format and after that check if each key has the expected type?


Solution

  • The problem is in how the schema library is used.

    1. First, the argument to Use is a callable that the value is passed through, so Use(str) converts the value to a string instead of checking its type. Since 1 converts cleanly to '1', you don't get the first error message. By contrast, 'pipe_1' cannot be converted to an int, so that error does appear. To check the type rather than convert the value, change Use to And.

    2. Second, validate() raises a SchemaError at the first key that fails, and the try/except block catches that single exception, so an error for a later key is never displayed. The first two messages most likely come from the error argument of the enclosing Schema, which is reported along with any error raised inside its first argument (the dictionary). To output all errors, I think it is necessary to check key by key and save the errors.