python pyyaml ruamel.yaml

yield in method without loop


I am studying the ruamel.yaml API by reading its source code, and I found this strange pattern:

    def construct_yaml_map(self, node):
        # type: (Any) -> Any
        data = CommentedMap()
        data._yaml_set_line_col(node.start_mark.line, node.start_mark.column)
        yield data
        self.construct_mapping(node, data, deep=True)
        self.set_collection_style(data, node)

which makes it necessary to write code like this (inspired by code found elsewhere):

    generator = yaml.constructor.construct_yaml_map(mapping_node)
    # the method body has not run yet; the call only returns a generator

    data = next(generator)
    # the method executes up to `yield data` (an empty CommentedMap)

    for _dummy in generator:
        # resuming the generator runs the method to its end;
        # it never yields again, so the loop body is never executed
        pass

    # or, as an alternative:
    # try:
    #     next(generator)
    # except StopIteration:
    #     pass

    # now data is filled

What could be the use of this yield in the middle of the method without any loop?


Solution

  • Python allows you to create recursive data structures and, unlike many other streaming formats, YAML allows you to dump them:

    import sys
    import ruamel.yaml
    
    data = dict(a=1)
    data['b'] = data
    
    yaml = ruamel.yaml.YAML()
    yaml.dump(data, sys.stdout)
    

    which gives:

    &id001
    a: 1
    b: *id001
    

    Here &id001 is an anchor in YAML terminology, and *id001, which refers back to it, is called an alias.

    I don't know of a way to create such data in Python with a single statement: you need the object (or its id) to exist before you can add it to itself. The YAML loader has a similar problem. When it parses and constructs a composite (a mapping or a sequence), it first gathers all the children (key/value pairs or elements, respectively) and only then constructs the composite. So when a composite in YAML refers to itself through an alias, the lookup for the anchor needs to provide a real, albeit incomplete, constructed Python object (a dict, a list, or, for a tagged object, possibly an instance of some class).

    This is why composites are created in two steps: you make an empty object and hand it back, so that a reference to it can be stored in the anchor/alias table if necessary, and then you fill it in, in the part of the code after the yield.
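
    The two-step idea can be sketched with a toy constructor. This is only an illustration under stated assumptions: the node classes, ToyConstructor, and its anchors table are hypothetical stand-ins invented for this example, not the actual ruamel.yaml or PyYAML classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Optional, Tuple


# Hypothetical, heavily simplified node types (not the real library classes).
@dataclass
class ScalarNode:
    value: Any


@dataclass
class AliasNode:
    anchor: str  # name of the anchor this alias points at


@dataclass
class MappingNode:
    pairs: List[Tuple[str, Any]]
    anchor: Optional[str] = None


class ToyConstructor:
    def __init__(self) -> None:
        # anchor name -> (possibly still incomplete) constructed object
        self.anchors: Dict[str, Any] = {}

    def construct_map(self, node: MappingNode) -> Iterator[dict]:
        data: dict = {}
        yield data  # step 1: hand back the empty object
        for key, child in node.pairs:  # step 2: fill it in
            data[key] = self.construct_object(child)

    def construct_object(self, node: Any) -> Any:
        if isinstance(node, ScalarNode):
            return node.value
        if isinstance(node, AliasNode):
            return self.anchors[node.anchor]
        generator = self.construct_map(node)
        data = next(generator)  # runs up to the yield
        if node.anchor is not None:
            # register the empty object before its children are built
            self.anchors[node.anchor] = data
        for _ in generator:  # resume after the yield; no further items
            pass
        return data


# Toy equivalent of:  &id001 {a: 1, b: *id001}
root = MappingNode(
    anchor="id001",
    pairs=[("a", ScalarNode(1)), ("b", AliasNode("id001"))],
)
data = ToyConstructor().construct_object(root)
print(data["b"] is data)  # True
```

    If construct_map built and returned the finished dict in one go, the alias lookup for id001 would run before anything had been registered under that anchor and would fail with a KeyError.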

    Composites don't need to have an alias to themselves among their values or elements, and those that have no anchor could be constructed in a one-step process. But that would lead to having different kinds of constructors, and the code calling construct_yaml_map() would need to know about these differently constructed objects. The problem shifts but doesn't go away, just as you can create the recursive Python structure differently but still need multiple statements.
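
    To make the "differently constructed objects" point concrete, here is a minimal sketch of what calling code has to do once one-step and two-step constructors coexist. All names (construct_scalar, construct_list, run_constructor) are hypothetical and chosen for illustration; the real libraries' dispatch logic is more involved.

```python
import types


def construct_scalar(value):
    # one-step constructor: returns the finished object directly
    return value


def construct_list(items):
    # two-step constructor: yields the empty object, then fills it
    data = []
    yield data
    data.extend(items)


def run_constructor(constructor, arg):
    # The caller has to distinguish the two kinds of constructors.
    result = constructor(arg)
    if isinstance(result, types.GeneratorType):
        generator = result
        result = next(generator)  # obtain the still-empty object
        for _ in generator:  # run the rest of the constructor
            pass
    return result


print(run_constructor(construct_scalar, 42))    # 42
print(run_constructor(construct_list, [1, 2]))  # [1, 2]
```

    The type check in run_constructor is exactly the kind of extra knowledge the caller needs; the more construction styles there are, the more such branches it has to carry.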

    BTW, this is not specific to ruamel.yaml. This code was, and AFAIK still is, in the PyYAML code from which ruamel.yaml was derived.