Recently, while editing a fairly complex YAML config, I needed to do a somewhat tricky YAML merge key operation, and I noticed that my favorite tool, ruamel.yaml, produces illogical results.
I know that merge keys are deprecated, but until the 1.3 spec is released I have to keep using them.
I filed a ticket, but the author marked it as invalid and stated that I misunderstand YAML.
Here is an example of YAML to test the merge:
tag1: &tag1
  subtag1:
    subsubtag1:
    subsubtag2:
      ssstag31:
        - var1
        - var2
      ssstag32:
        - var1
        - var2
tag2:
  <<: *tag1
  subtag1:
    subsubtag2:
      ssstag31:
        - var3
        - var4
I expect that it will first merge the tag1 anchor into tag2 and then replace subtag1 with the new data, so tag2 will look like this:
tag2:
  subtag1:
    subsubtag2:
      ssstag31:
        - var3
        - var4
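To make that expectation concrete, here is a minimal sketch using plain Python dicts instead of a YAML library. It assumes that subsubtag2 holds both ssstag31 and ssstag32 in tag1, and the names `tag1` and `tag2_own` are only illustrative. A merge key is shallow: a key written directly in the mapping replaces the whole value that would otherwise come from the merged mapping.

```python
# tag1 as a plain dict (nesting assumed from the example above).
tag1 = {
    "subtag1": {
        "subsubtag1": None,
        "subsubtag2": {
            "ssstag31": ["var1", "var2"],
            "ssstag32": ["var1", "var2"],
        },
    },
}

# The keys written directly under tag2, i.e. everything except "<<".
tag2_own = {"subtag1": {"subsubtag2": {"ssstag31": ["var3", "var4"]}}}

# Shallow merge: start from tag1, then let tag2's own keys replace whole values.
tag2 = {**tag1, **tag2_own}

print(tag2)
# {'subtag1': {'subsubtag2': {'ssstag31': ['var3', 'var4']}}}
```

Nothing from tag1's subtag1 survives, because the replacement happens at the top-level key and not recursively.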
Unfortunately, ruamel.yaml does the merge but doesn't replace the data, so tag2 ends up identical to tag1.
This is easy to test with a trivial Python program, which produces the results I expect:
import yaml

class NoAliasDumper(yaml.SafeDumper):
    def ignore_aliases(self, data):
        return True

with open("example.yaml") as f:
    y = yaml.safe_load(f)

with open(r'merged.yaml', 'w') as file:
    yaml.dump(y, file, Dumper=NoAliasDumper)
Please advise where I went wrong, given that PyYAML does the right merge and ruamel.yaml doesn't. What is the correct result of the merge? It seems the bug is either in PyYAML or in ruamel.yaml.
P.S. By the way, it's funny to check this snippet in online utilities, which deal with it with varying degrees of success.
I am not sure what you mean by "ruamel.yaml unfortunately does merge tag1 anchor to tag2".
import sys
import ruamel.yaml
from pathlib import Path
file_in = Path('expand.yaml')
yaml = ruamel.yaml.YAML()
data = yaml.load(file_in)
yaml.dump(data, sys.stdout)
This gives exactly the original input:
tag1: &tag1
  subtag1: 42
  subtag2: baz
tag2:
  <<: *tag1
  subtag1: 18
  subtag3: *tag1
So it preserves both the aliases and merge key. (I am using a smaller example than yours, but more complete in that not all the keys of the merge are "covered" by other keys and that the anchor is still referenced if the merge key is removed).
You can ignore aliases in ruamel.yaml, but the effect is not really useful.
yaml = ruamel.yaml.YAML()
yaml.representer.ignore_aliases = lambda x: True
data = yaml.load(file_in)
yaml.dump(data, sys.stdout)
which gives:
tag1:
  subtag1: 42
  subtag2: baz
tag2:
  <<:
    subtag1: 42
    subtag2: baz
  subtag1: 18
  subtag3:
    subtag1: 42
    subtag2: baz
IIRC the merge-expand option of the yaml utility (as provided by the package ruamel.yaml.cmd) was made before the ruamel.yaml package could preserve merges. That option relies on the mapping_flattener of the SafeLoader (the RoundTripLoader's doesn't flatten, in order not to lose the merge key information). But either the improvements on the PyYAML original (which handles duplicate keys incorrectly), or the interaction between the aliases and merge keys, caused that to not function properly.
Unfortunately you cannot use PyYAML's flatten_mapping, as it errors with the less than useful message:
expected a mapping or list of mappings for merging, but found mapping
But you can do:
import sys
import ruamel.yaml
from pathlib import Path
from ruamel.yaml.constructor import ConstructorError  # used by the raise statements below

def flatten_mapping(self, node):
    merge = []
    index = 0
    while index < len(node.value):
        key_node, value_node = node.value[index]
        if key_node.tag == 'tag:yaml.org,2002:merge':
            # remove the merge key itself and collect the key/value pairs it refers to
            del node.value[index]
            if isinstance(value_node, ruamel.yaml.nodes.MappingNode):
                self.flatten_mapping(value_node)
                merge.extend(value_node.value)
            elif isinstance(value_node, ruamel.yaml.nodes.SequenceNode):
                submerge = []
                for subnode in value_node.value:
                    if not isinstance(subnode, ruamel.yaml.nodes.MappingNode):
                        raise ConstructorError(
                            'while constructing a mapping',
                            node.start_mark,
                            f'expected a mapping for merging, but found {subnode.id!s}',
                            subnode.start_mark,
                        )
                    self.flatten_mapping(subnode)
                    submerge.append(subnode.value)
                submerge.reverse()
                for value in submerge:
                    merge.extend(value)
            else:
                raise ConstructorError(
                    'while constructing a mapping',
                    node.start_mark,
                    'expected a mapping or list of mappings for merging, '
                    f'but found {value_node.id!s}',
                    value_node.start_mark,
                )
        elif key_node.tag == 'tag:yaml.org,2002:value':
            key_node.tag = 'tag:yaml.org,2002:str'
            index += 1
        else:
            index += 1
    if bool(merge):
        # keys set directly on the mapping win; only append merged keys not already present
        values = [k[0].value for k in node.value]
        for k in merge:
            if k[0].value in values:
                continue
            node.value.append(k)

file_in = Path('expand.yaml')

yaml = ruamel.yaml.YAML()
# Using PyYAML's flattener doesn't work
# import yaml as pyyaml
# yaml.Constructor.flatten_mapping = pyyaml.constructor.SafeConstructor.flatten_mapping
yaml.Constructor.flatten_mapping = flatten_mapping
# uncomment next line if you don't want aliases
# yaml.representer.ignore_aliases = lambda x: True
data = yaml.load(file_in)
yaml.dump(data, sys.stdout)
which I think gives what you want:
tag1: &tag1
  subtag1: 42
  subtag2: baz
tag2:
  subtag1: 18
  subtag3: *tag1
  subtag2: baz
So your ticket was not invalid.
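The precedence rule that the flattener applies can also be sketched on plain dicts, without any YAML library. This is only an illustration: `expand_merge` and `MERGE_KEY` are hypothetical names and not part of ruamel.yaml or PyYAML. It follows the merge key rule that keys set directly on a mapping win over merged keys, and that in a list of merge sources the earlier mappings win over later ones. Merged keys are added after the explicit ones, which matches the key order in the output above.

```python
MERGE_KEY = "<<"  # the YAML merge key, kept here as an ordinary dict key


def expand_merge(mapping):
    """Return a copy of `mapping` with its "<<" entry expanded (shallow merge)."""
    # Keys written directly on the mapping always take precedence.
    result = {k: v for k, v in mapping.items() if k != MERGE_KEY}

    sources = mapping.get(MERGE_KEY, [])
    if isinstance(sources, dict):
        sources = [sources]  # a single mapping behaves like a one-element list

    for source in sources:
        # A merge source may itself contain a merge key, so expand it first.
        for key, value in expand_merge(source).items():
            # setdefault keeps explicit keys and keys from earlier sources.
            result.setdefault(key, value)
    return result


tag1 = {"subtag1": 42, "subtag2": "baz"}
tag2 = {"<<": tag1, "subtag1": 18, "subtag3": tag1}

print(expand_merge(tag2))
# {'subtag1': 18, 'subtag3': {'subtag1': 42, 'subtag2': 'baz'}, 'subtag2': 'baz'}
```

Only the mapping that carries the merge key is expanded; the value of subtag3 is left untouched, just as the alias `*tag1` is kept in the dumped YAML.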