How to include another YAML as a global base file using PyYAML


I am using YAML as a configuration engine, and I need to use it in such a way that I can split the configuration hierarchically. Say I have a base.yaml which contains some default values, and then an overridden.yaml file which overrides the base values:

base.yaml

value: base

overridden.yaml

!include base.yaml

value: overridden

In the end, if I load the overridden.yaml file, ideally I want to see the value set as "overridden".

I can use the !include trick to include another YAML file, but one piece is still missing: it does not allow me to define more entries right after the global include, and fails with this error:

yaml.scanner.ScannerError: mapping values are not allowed here

I am using PyYAML to load the YAML files.


Solution

  • The reason this is not working is that your !include creates a single node, because a YAML tag applies to a single node. This would be the same as having your overridden.yaml look like:

    "some string"
    value: overridden
    

    which is not valid YAML either.
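
As a quick check of the point above, the following sketch feeds both forms to PyYAML and confirms that each one is rejected. It assumes PyYAML is installed, and the helper name is_rejected is made up for illustration. Parsing fails before any tag gets constructed, so no !include constructor has to be registered for this check.

```python
import yaml

# The file from the question: a tagged scalar followed by a mapping entry.
WITH_INCLUDE = "!include base.yaml\n\nvalue: overridden\n"

# The equivalent form: a plain string node followed by a mapping entry.
WITH_STRING = '"some string"\nvalue: overridden\n'


def is_rejected(text):
    """Return True if PyYAML refuses to load `text` (hypothetical helper)."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError:
        # ScannerError and ParserError are both subclasses of YAMLError.
        return True
    return False


print(is_rejected(WITH_INCLUDE))
print(is_rejected(WITH_STRING))
```

The exact exception class may differ between the two inputs, which is why the sketch catches the common base class yaml.YAMLError.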

    So the !include does not insert the text of the included file and then parse the combined result. You could do that with a preprocessor and/or a template language, but then you end up with an invalid YAML file, because keys in mappings have to be unique according to the YAML 1.2 standard, and also according to the older YAML 1.1 that PyYAML supports. (This, however, doesn't stop PyYAML from loading such a file without even a warning.)
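
To illustrate the parenthetical remark, here is a minimal sketch of what a textual include would effectively hand to the parser: the same key twice in one mapping. It assumes PyYAML is installed. The load is expected to succeed silently with the later occurrence winning, but that is observed PyYAML behaviour, not something the YAML specification promises.

```python
import yaml

# What a textual include would effectively produce: a duplicated key.
TEXT = "value: base\nvalue: overridden\n"

# PyYAML raises no exception and emits no warning here, even though
# the specification requires mapping keys to be unique.
data = yaml.safe_load(TEXT)

print(data)
```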

    What IMO you should look at is using the language-independent merge feature in combination with the include. Your base.yaml can look the same, and your overridden.yaml can then look like:

    - &base !include base.yaml
    - <<: *base
      value: overridden
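
The merge behaviour that this layout relies on can be sketched without any extra files. In the example below the anchored node is an inline mapping with assumed contents, standing in for `!include base.yaml`, so that it runs without a custom !include constructor. The extra key other is hypothetical and only there to show that non-overridden defaults survive. It assumes PyYAML is installed.

```python
import yaml

# Same shape as the overridden.yaml above, except that the anchored node is
# an inline mapping (assumed contents) instead of `!include base.yaml`.
TEXT = """\
- &base
  value: base
  other: kept
- <<: *base
  value: overridden
"""

documents = yaml.safe_load(TEXT)

base = documents[0]    # the anchored defaults
config = documents[1]  # the defaults merged with the local overrides

print(config["value"])
print(config["other"])
```

Note that the loaded result is a two-item list, so the effective configuration is the second item, not the top-level object.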