I am working on a project that parses an AWS CloudFormation YAML file to extract all the !ImportValue entries from the template.
I am trying to use ruamel.yaml for this (I am new to it). I was able to read the YAML file and get the individual elements.
import ruamel.yaml

def general_constructor(loader, tag_suffix, node):
    # return the raw node value for any tag starting with '!'
    return node.value

ruamel.yaml.SafeLoader.add_multi_constructor(u'!', general_constructor)

with open(cfFile, 'r') as service:
    stream = service.read()
yaml_data = ruamel.yaml.safe_load(stream)
print(yaml_data)
The code above reads the specified YAML file, and the output looks like the following:
{'Application': {'Properties': {'ApplicationName': [ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'-'),
SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'***'), ScalarNode(tag=u'!ImportValue', value=u'jkl')])],
*
*
ScalarNode(tag=u'!ImportValue', value=u'def'),
*
*
ScalarNode(tag=u'!ImportValue', value=u'rst')])]},
So there are a bunch of !ImportValue entries listed as ScalarNode (e.g. ScalarNode(tag=u'!ImportValue', value=u'rst')), and those are what I actually want to extract. These !ImportValue entries are scattered at various places in the template. What would be the best way to extract their values?
In our CloudFormation setup we have a bunch of YAML files; some of them export certain resources and other YAML files import them. So I want to build a sort of dependency map (maybe a JSON file) that depicts the interdependence between the CloudFormation files.
If you use ruamel.yaml's round-trip loader you don't have to do
anything special to load the tag, and walking recursively over the
resulting data structure is relatively easy. The corresponding key
needs to be passed on, as at least the first !ImportValue is within
a sequence under the key.
Assuming an input.yaml consisting of:
Application:
  Properties:
    ApplicationName: ["-", ["**", !ImportValue "jkl"]]
    AnotherKey:
    - 42
    - nested: !ImportValue xyz
(which might not be exactly what you got as input, but will do for
demonstration purposes), and using the new ruamel.yaml API (which
defaults to round-trip loading/dumping):
import sys
from pathlib import Path
import ruamel.yaml

# name of the attribute under which round-trip loaded nodes keep their tag
ta = ruamel.yaml.comments.Tag.attrib

yaml = ruamel.yaml.YAML()
data = yaml.load(Path('input.yaml'))

def process(d, key=None):
    if isinstance(d, dict):
        for k, v in d.items():
            for res in process(v, k):  # recurse and pass on new key
                yield res
    elif isinstance(d, list):
        for item in d:
            for res in process(item, key):  # items keep the key of the enclosing sequence
                yield res
    else:
        try:
            if getattr(d, ta, None).value == '!ImportValue':
                yield (key, d)
        except AttributeError:
            pass  # untagged scalar, nothing to report

for k, v in process(data):
    print(k, '->', v)
which gives:
ApplicationName -> jkl
nested -> xyz
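To get from those extracted names to the dependency map mentioned in the question, something along the following lines might work. This is only a minimal sketch under assumptions that are not in the original: the function name build_dependency_map, the file names, and the two input dicts are hypothetical, and it presumes you have already collected, per template, the imported names (for example the string form of each value yielded by process) and the names that the template exports.

```python
import json


def build_dependency_map(imports_by_file, exports_by_file):
    """Map each template to the templates whose exports it imports.

    Both arguments are hypothetical inputs: dicts of filename -> list of names.
    """
    # Invert the exports: export name -> file that defines it.
    exporter_of = {}
    for filename, names in exports_by_file.items():
        for name in names:
            exporter_of[name] = filename

    dependencies = {}
    for filename, names in imports_by_file.items():
        # Imports without a known exporter are reported, not silently dropped.
        resolved = sorted({exporter_of[n] for n in names if n in exporter_of})
        unresolved = sorted({n for n in names if n not in exporter_of})
        dependencies[filename] = {"depends_on": resolved, "unresolved": unresolved}
    return dependencies


# Made-up example data for illustration only.
imports_by_file = {
    "service.yaml": ["jkl", "xyz"],
    "network.yaml": [],
}
exports_by_file = {
    "network.yaml": ["jkl"],
    "service.yaml": [],
}

dependency_map = build_dependency_map(imports_by_file, exports_by_file)
print(json.dumps(dependency_map, indent=2, sort_keys=True))
```

Keeping the unresolved names in the result makes it easier to spot imports whose exporting template has not been scanned yet.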