I'm trying to use ruamel.yaml to modify an AWS CloudFormation template on the fly using python. I added the following code to make the safe_load working with CloudFormation functions such as !Ref
. However, when I dump them out, those values with !Ref (or any other functions) will be wrapped by quotes. CloudFormation is not able to identify that.
See example below:
import sys, json, io, boto3
import ruamel.yaml
def funcparse(loader, node):
node.value = {
ruamel.yaml.ScalarNode: loader.construct_scalar,
ruamel.yaml.SequenceNode: loader.construct_sequence,
ruamel.yaml.MappingNode: loader.construct_mapping,
}[type(node)](node)
node.tag = node.tag.replace(u'!Ref', 'Ref').replace(u'!', u'Fn::')
return dict([ (node.tag, node.value) ])
funcnames = [ 'Ref', 'Base64', 'FindInMap', 'GetAtt', 'GetAZs', 'ImportValue',
'Join', 'Select', 'Split', 'Split', 'Sub', 'And', 'Equals', 'If',
'Not', 'Or' ]
for func in funcnames:
ruamel.yaml.SafeLoader.add_constructor(u'!' + func, funcparse)
txt = open("/space/tmp/a.template","r")
base = ruamel.yaml.safe_load(txt)
base["foo"] = {
"name": "abc",
"Resources": {
"RouteTableId" : "!Ref aaa",
"VpcPeeringConnectionId" : "!Ref bbb",
"yourname": "dfw"
}
}
ruamel.yaml.safe_dump(
base,
sys.stdout,
default_flow_style=False
)
The input file is like this:
foo:
bar: !Ref barr
aa: !Ref bb
The output is like this:
foo:
Resources:
RouteTableId: '!Ref aaa'
VpcPeeringConnectionId: '!Ref bbb'
yourname: dfw
name: abc
Notice the '!Ref VpcRouteTable' is been wrapped by single quotes. This won't be identified by CloudFormation. Is there a way to configure dumper so that the output will be like:
foo:
Resources:
RouteTableId: !Ref aaa
VpcPeeringConnectionId: !Ref bbb
yourname: dfw
name: abc
Other things I have tried:
Essentially you tweak the loader, to load tagged (scalar) objects as if they were mappings, with the tag the key and the value the scalar. But you don't do anything to distinguish the dict
loaded from such a mapping from other dicts loaded from normal mappings, nor do you have any specific code to represent such a mapping to "get the tag back".
When you try to "create" a scalar with a tag, you just make a string starting with an exclamation mark, and that needs to get dumped quoted to distinguish it from real tagged nodes.
What obfuscates this all, is that your example overwrites the loaded data by assigning to base["foo"]
so the only thing you can derive from the safe_load
, and and all your code before that, is that it doesn't throw an exception. I.e. if you leave out the lines starting with base["foo"] = {
your output will look like:
foo:
aa:
Ref: bb
bar:
Ref: barr
And in that Ref: bb
is not distinguishable from a normal dumped dict. If you want to explore this route, then you should make a subclass TagDict(dict)
, and have funcparse
return that subclass, and also add a representer
for that subclass that re-creates the tag from the key and then dumps the value. Once that works (round-trip equals input), you can do:
"RouteTableId" : TagDict('Ref', 'aaa')
If you do that, you should, apart from removing non-used libraries, also change your code to close the file-pointer txt
in your code, as that can lead to problems. You can do this elegantly be using the with
statement:
with open("/space/tmp/a.template","r") as txt:
base = ruamel.yaml.safe_load(txt)
(I also would leave out the "r"
(or put a space before it); and replace txt
with a more appropriate variable name that indicates this is an (input) file pointer).
You also have the entry 'Split'
twice in your funcnames
, which is superfluous.
A more generic solution can be achieved by using a multi-constructor
that matches any tag and having three basic types to cover scalars, mappings and sequences.
import sys
import ruamel.yaml
yaml_str = """\
foo:
scalar: !Ref barr
mapping: !Select
a: !Ref 1
b: !Base64 A413
sequence: !Split
- !Ref baz
- !Split Multi word scalar
"""
class Generic:
def __init__(self, tag, value, style=None):
self._value = value
self._tag = tag
self._style = style
class GenericScalar(Generic):
@classmethod
def to_yaml(self, representer, node):
return representer.represent_scalar(node._tag, node._value)
@staticmethod
def construct(constructor, node):
return constructor.construct_scalar(node)
class GenericMapping(Generic):
@classmethod
def to_yaml(self, representer, node):
return representer.represent_mapping(node._tag, node._value)
@staticmethod
def construct(constructor, node):
return constructor.construct_mapping(node, deep=True)
class GenericSequence(Generic):
@classmethod
def to_yaml(self, representer, node):
return representer.represent_sequence(node._tag, node._value)
@staticmethod
def construct(constructor, node):
return constructor.construct_sequence(node, deep=True)
def default_constructor(constructor, tag_suffix, node):
generic = {
ruamel.yaml.ScalarNode: GenericScalar,
ruamel.yaml.MappingNode: GenericMapping,
ruamel.yaml.SequenceNode: GenericSequence,
}.get(type(node))
if generic is None:
raise NotImplementedError('Node: ' + str(type(node)))
style = getattr(node, 'style', None)
instance = generic.__new__(generic)
yield instance
state = generic.construct(constructor, node)
instance.__init__(tag_suffix, state, style=style)
ruamel.yaml.add_multi_constructor('', default_constructor, Loader=ruamel.yaml.SafeLoader)
yaml = ruamel.yaml.YAML(typ='safe', pure=True)
yaml.default_flow_style = False
yaml.register_class(GenericScalar)
yaml.register_class(GenericMapping)
yaml.register_class(GenericSequence)
base = yaml.load(yaml_str)
base['bar'] = {
'name': 'abc',
'Resources': {
'RouteTableId' : GenericScalar('!Ref', 'aaa'),
'VpcPeeringConnectionId' : GenericScalar('!Ref', 'bbb'),
'yourname': 'dfw',
's' : GenericSequence('!Split', ['a', GenericScalar('!Not', 'b'), 'c']),
}
}
yaml.dump(base, sys.stdout)
which outputs:
bar:
Resources:
RouteTableId: !Ref aaa
VpcPeeringConnectionId: !Ref bbb
s: !Split
- a
- !Not b
- c
yourname: dfw
name: abc
foo:
mapping: !Select
a: !Ref 1
b: !Base64 A413
scalar: !Ref barr
sequence: !Split
- !Ref baz
- !Split Multi word scalar
Please note that sequences and mappings are handled correctly and that they can be created as well. There is however no check that:
GenericMapping
to behave more like dict
, then you probably want it a subclass of dict
(and not of Generic
) and provide the appropriate __init__
(idem for GenericSequence
/list
)When the assignment is changed to something more close to yours:
base["foo"] = {
"name": "abc",
"Resources": {
"RouteTableId" : GenericScalar('!Ref', 'aaa'),
"VpcPeeringConnectionId" : GenericScalar('!Ref', 'bbb'),
"yourname": "dfw"
}
}
the output is:
foo:
Resources:
RouteTableId: !Ref aaa
VpcPeeringConnectionId: !Ref bbb
yourname: dfw
name: abc
which is exactly the output you want.