I want to have a base config file that other config files use to share common config.
E.g., if I have one file, base.yml,
with
foo: 1
bar:
- 2
- 3
And then a second file some_file.yml
with
foo: 2
baz: "baz"
I'd want to end up with a merged config file with
foo: 2
bar:
- 2
- 3
baz: "baz"
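In Python terms, the desired result corresponds to a shallow dictionary merge in which the keys of the second file win. A minimal sketch, with plain dicts standing in for the two parsed files (the variable names are only illustrative):

```python
# Plain dicts stand in for the parsed contents of base.yml and some_file.yml.
base = {"foo": 1, "bar": [2, 3]}
override = {"foo": 2, "baz": "baz"}

# Shallow merge: keys from the more specific file replace those from the base.
merged = {**base, **override}
print(merged)
```

Nested mappings would be replaced wholesale rather than merged recursively, which matches the example above but may not be what every config layout needs.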
It's easy enough to write a custom loader that handles an !include tag.
class ConfigLoader(yaml.SafeLoader):
    def __init__(self, stream):
        super().__init__(stream)
        self._base = Path(stream.name).parent

    def include(self, node):
        file_name = self.construct_scalar(node)
        file_path = self._base.joinpath(file_name)
        with file_path.open("rt") as fh:
            return yaml.load(fh, IncludeLoader)
Then I can parse an !include tag. So if my file is
inherit:
  !include base.yml
foo: 2
baz: "baz"
But now the base config is a nested mapping. I.e., if I load the file I'll end up with
config = {'a': [42], 'c': [3.6, [1, 2, 3]], 'include': [{'a': 1, 'b': [1.43, 543.55]}]}
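For reference, a self-contained sketch of such a loader with PyYAML, including the add_constructor registration, which the snippet above does not show. The class body is adapted from the question; the temporary files and the getattr fallback for unnamed streams are assumptions added for illustration.

```python
import tempfile
from pathlib import Path

import yaml


class ConfigLoader(yaml.SafeLoader):
    """SafeLoader that resolves !include relative to the including file."""

    def __init__(self, stream):
        super().__init__(stream)
        # Assumption: fall back to the current directory for unnamed streams.
        self._base = Path(getattr(stream, "name", ".")).parent

    def include(self, node):
        file_path = self._base / self.construct_scalar(node)
        with file_path.open("rb") as fh:
            return yaml.load(fh, ConfigLoader)


# Without this registration the loader does not know the !include tag.
ConfigLoader.add_constructor("!include", ConfigLoader.include)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "base.yml").write_text("foo: 1\nbar:\n- 2\n- 3\n")
    (root / "some_file.yml").write_text(
        'inherit: !include base.yml\nfoo: 2\nbaz: "baz"\n'
    )
    with (root / "some_file.yml").open("rb") as fh:
        config = yaml.load(fh, ConfigLoader)

print(config)
```

The base file ends up nested under the inherit key instead of being merged into the top level, which is the behaviour described above.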
But if I don't make the tag part of a mapping, e.g.
!include base.yml
foo: 2
baz: "baz"
I get an error: yaml.scanner.ScannerError: mapping values are not allowed here.
But I know that the YAML parser can parse tags without needing a mapping, because I can do things like
!!python/object:foo.Bar
x: 1.0
y: 3.14
So how do I write a loader and/or structure my YAML file so that I can include another file in my configuration?
In YAML you cannot mix scalars, mapping keys and sequence elements. This is invalid YAML:
- abc
d: e
and so is this
some_file_name
a: b
Quoting that scalar, or providing a tag for it, does of course not change the fact that it is invalid YAML.
As you already found out, you can trick the loader into returning a dict instead of the string (just like the parser already has built-in constructors for non-primitive types like datetime.date).
That this:
!!python/object:foo.Bar
x: 1.0
y: 3.14
works is because the whole mapping is tagged, whereas you tag just a scalar value.
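The difference can be made visible by composing the documents into nodes without constructing any objects. A small sketch using PyYAML's yaml.compose; the !point tag is a made-up name used only for this illustration:

```python
import yaml

# A tag followed by a block mapping applies to the whole mapping node.
tagged_mapping = yaml.compose("!point\nx: 1.0\ny: 3.14\n")
print(type(tagged_mapping).__name__, tagged_mapping.tag)

# A tag followed by text on the same line applies to that scalar only.
tagged_scalar = yaml.compose("!include base.yaml\n")
print(type(tagged_scalar).__name__, tagged_scalar.tag, tagged_scalar.value)

# A tagged scalar followed by mapping entries is rejected while parsing,
# before any constructor for the tag gets a chance to run.
try:
    yaml.compose("!include base.yaml\nfoo: 2\n")
    error = None
except yaml.YAMLError as exc:
    error = exc
print(type(error).__name__)
```

Because the failure happens at the scanning stage, no custom constructor for the tag can rescue the third document.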
The following would also be invalid syntax:
!include base.yaml
foo: 2
baz: baz
but you could do:
!include
filename: base.yaml
foo: 2
baz: baz
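One way the tagged-mapping form with a 'filename' key could be handled in PyYAML is sketched below. The make_loader helper, the IncludeMappingLoader class name and the base_dir parameter are hypothetical names introduced for this example, and the temporary directory only exists to make it runnable.

```python
import tempfile
from pathlib import Path

import yaml


def make_loader(base_dir):
    """Return a SafeLoader subclass whose !include resolves files under base_dir."""

    class IncludeMappingLoader(yaml.SafeLoader):
        pass

    def construct_include(loader, node):
        # The tag sits on the whole mapping, so build that mapping first.
        own = loader.construct_mapping(node, deep=True)
        # 'filename' is treated specially: it names the base file and is dropped.
        file_name = own.pop("filename")
        with (Path(base_dir) / file_name).open("rb") as fh:
            merged = yaml.load(fh, IncludeMappingLoader)
        merged.update(own)  # keys of the including file win
        return merged

    IncludeMappingLoader.add_constructor("!include", construct_include)
    return IncludeMappingLoader


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "base.yaml").write_text("foo: 1\nbar:\n- 2\n- 3\n")
    document = "!include\nfilename: base.yaml\nfoo: 2\nbaz: baz\n"
    config = yaml.load(document, make_loader(tmp))

print(config)
```

The drawback of this layout is that 'filename' becomes a reserved key that a config can no longer use for its own purposes.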
and process the 'filename' key in a special way, or make the !include tag an empty key:
!include : base.yaml # : is a valid tag character, so you need the space
foo: 2
baz: baz
I would however look at using merge keys, as merging is essentially what you are trying to do. The following YAML document, loaded here with ruamel.yaml, works:
import sys
import ruamel.yaml
from pathlib import Path
yaml_str = """
<<: {x: 42, y: 196, foo: 3}
foo: 2
baz: baz
"""
yaml = ruamel.yaml.YAML(typ='safe')
yaml.default_flow_style = False
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)
which gives:
baz: baz
foo: 2
x: 42
y: 196
So you should be able to do:
<<: !load base.yaml
foo: 2
baz: baz
and anyone with knowledge of merge keys would know what happens if base.yaml does include the key foo with value 3, and would also understand:
<<: [!load base.yaml, !load config.yaml]
foo: 2
baz: baz
(As I tend to associate "including" with textual inclusion as in the C preprocessor, I think !load might be a more appropriate tag, but that is probably a matter of taste.)
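The precedence rules that readers familiar with merge keys would expect can be checked with inline mappings standing in for the loaded files. A short sketch using PyYAML's safe_load, which also implements merge keys; the keys and values are made up for the illustration:

```python
import yaml

document = """
<<: [{foo: 3, x: 42}, {x: 0, y: 196}]
foo: 2
baz: baz
"""

# Keys written in the mapping itself override merged keys, and among the
# merged mappings an earlier entry in the list overrides a later one.
data = yaml.safe_load(document)
print(data)
```

Here foo keeps the value 2 from the document itself, and x takes the value 42 from the first mapping in the list.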
To get the merge keys to work, it is probably easiest to just subclass the Constructor, as merging is done before tag resolving:
import sys
import warnings
import ruamel.yaml
from ruamel.yaml.nodes import MappingNode, SequenceNode, ScalarNode
from ruamel.yaml.constructor import (
    ConstructorError,
    DuplicateKeyError,
    DuplicateKeyFutureWarning,
)
from ruamel.yaml.compat import _F
from pathlib import Path


class MyConstructor(ruamel.yaml.constructor.SafeConstructor):
    def flatten_mapping(self, node):
        # type: (Any) -> Any
        """
        This implements the merge key feature http://yaml.org/type/merge.html
        by inserting keys from the merge dict/list of dicts if not yet
        available in this node
        """
        merge = []  # type: List[Any]
        index = 0
        while index < len(node.value):
            key_node, value_node = node.value[index]
            if key_node.tag == 'tag:yaml.org,2002:merge':
                if merge:  # double << key
                    if self.allow_duplicate_keys:
                        del node.value[index]
                        index += 1
                        continue
                    args = [
                        'while constructing a mapping',
                        node.start_mark,
                        'found duplicate key "{}"'.format(key_node.value),
                        key_node.start_mark,
                        """
                        To suppress this check see:
                           http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys
                        """,
                        """\
                        Duplicate keys will become an error in future releases, and are errors
                        by default when using the new API.
                        """,
                    ]
                    if self.allow_duplicate_keys is None:
                        warnings.warn(DuplicateKeyFutureWarning(*args))
                    else:
                        raise DuplicateKeyError(*args)
                del node.value[index]
                # added branch: a scalar tagged !load names a file whose
                # top-level key/value nodes are merged in
                if isinstance(value_node, ScalarNode) and value_node.tag == '!load':
                    file_path = None
                    try:
                        if self.loader.reader.stream is not None:
                            file_path = Path(self.loader.reader.stream.name).parent / value_node.value
                    except AttributeError:
                        pass
                    if file_path is None:
                        file_path = Path(value_node.value)
                    # there is a bug in ruamel.yaml<=0.17.20 that prevents
                    # the use of a Path as argument to compose()
                    with file_path.open('rb') as fp:
                        merge.extend(ruamel.yaml.YAML().compose(fp).value)
                elif isinstance(value_node, MappingNode):
                    self.flatten_mapping(value_node)
                    merge.extend(value_node.value)
                elif isinstance(value_node, SequenceNode):
                    submerge = []
                    for subnode in value_node.value:
                        if not isinstance(subnode, MappingNode):
                            raise ConstructorError(
                                'while constructing a mapping',
                                node.start_mark,
                                _F(
                                    'expected a mapping for merging, but found {subnode_id!s}',
                                    subnode_id=subnode.id,
                                ),
                                subnode.start_mark,
                            )
                        self.flatten_mapping(subnode)
                        submerge.append(subnode.value)
                    submerge.reverse()
                    for value in submerge:
                        merge.extend(value)
                else:
                    raise ConstructorError(
                        'while constructing a mapping',
                        node.start_mark,
                        _F(
                            'expected a mapping or list of mappings for merging, '
                            'but found {value_node_id!s}',
                            value_node_id=value_node.id,
                        ),
                        value_node.start_mark,
                    )
            elif key_node.tag == 'tag:yaml.org,2002:value':
                key_node.tag = 'tag:yaml.org,2002:str'
                index += 1
            else:
                index += 1
        if bool(merge):
            node.merge = merge  # separate merge keys to be able to update without duplicate
            node.value = merge + node.value


yaml = ruamel.yaml.YAML(typ='safe', pure=True)
yaml.default_flow_style = False
yaml.Constructor = MyConstructor
yaml_str = """\
<<: !load base.yaml
foo: 2
baz: baz
"""
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)
print('---')
file_name = Path('test.yaml')
file_name.write_text("""\
<<: !load base.yaml
bar: 2
baz: baz
""")
data = yaml.load(file_name)
yaml.dump(data, sys.stdout)
This prints:
bar:
- 2
- 3
baz: baz
foo: 2
---
bar: 2
baz: baz
foo: 1
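The ordering that produces this output can be imitated on plain (key, value) pairs: merged pairs are placed in front of the mapping's own pairs, so the mapping's own keys are applied last. A simplified sketch, not the ruamel.yaml implementation; the flatten function and its arguments are hypothetical names:

```python
def flatten(own_pairs, merge_sources):
    """Imitate the pair ordering of flatten_mapping on plain (key, value) pairs."""
    merge = []
    # Later sources go first so that earlier sources overwrite them.
    for source in reversed(merge_sources):
        merge.extend(source)
    # Merged pairs precede the mapping's own pairs, so own keys win.
    return dict(merge + own_pairs)


own_pairs = [("bar", 2), ("baz", "baz")]    # pairs written in test.yaml
base_pairs = [("foo", 1), ("bar", [2, 3])]  # pairs loaded from base.yaml
result = flatten(own_pairs, [base_pairs])
print(result)
```

This mirrors the second document above: bar from test.yaml replaces the list from base.yaml, while foo is only available from the base.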
Notes:
- Open YAML files in binary mode (open(filename, 'rb')).
- If the question had included the definition of IncludeLoader, it would have been possible to provide a full working example with the merge keys (or find out for you that it doesn't work for some reason).
- It is unclear whether yaml.load() is an instance method call (import ruamel.yaml; yaml = ruamel.yaml.YAML()) or a function call (from ruamel import yaml). You should not use the latter as it is deprecated.