PyYAML nested objects


I want to parse the following YAML string with PyYAML:

      - !Table
        header:
          - !Column
            - !Paragraph
              text: 'header1'
          - !Column 
            - !Paragraph
              text: 'header2'

I have Table and Paragraph classes that inherit from yaml.YAMLObject, but I don't know how to handle the !Column tag. It should be treated only as a named array tag.

When I try to build the objects with yaml.load(), I get the following error:

yaml.constructor.ConstructorError: could not determine a constructor 
for the tag '!Column'
in "<unicode string>", line 19, column 17:
              - !Column
                ^

Solution

  • There are two problems with using yaml.YAMLObject:

    • It is not transparent to use it with anything but the default yaml.Loader. That pushes you towards the unsafe yaml.load(), which you should not use unless you have full control over the input, now and in the future.
    • You cannot use it for objects whose node is a scalar or a sequence. Only a mapping can be used.
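
    To illustrate the first point, here is a minimal sketch of what it takes to load a YAMLObject subclass safely. It assumes PyYAML (import yaml) rather than ruamel.yaml, and the one-item document is made up for the example. The class has to opt in through its yaml_loader attribute before yaml.safe_load() will accept the tag:

```python
import yaml  # PyYAML, assumed here instead of ruamel.yaml


class Paragraph(yaml.YAMLObject):
    yaml_tag = '!Paragraph'
    # Register the tag on SafeLoader; by default it is not registered there.
    yaml_loader = yaml.SafeLoader


# Hypothetical one-item document, just for this sketch.
doc = """\
- !Paragraph
  text: header1
"""

data = yaml.safe_load(doc)
paragraph = data[0]
print(type(paragraph).__name__, paragraph.text)
```

    Without the yaml_loader line, safe_load() rejects !Paragraph, because the tag is then only known to the non-safe loaders.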

    Your !Table and !Paragraph are mapping nodes, but your !Column is a sequence node, so you need to register a constructor for it explicitly:

    import sys
    from ruamel import yaml
    
    yaml_str = """\
    - !Table
      header:
        - !Column
          - !Paragraph
            text: 'header1'
        - !Column
          - !Paragraph
            text: 'header2'
    
    """
    
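    # Mapping nodes: YAMLObject registers a constructor for each yaml_tag.
    class Table(yaml.YAMLObject):
        yaml_tag = '!Table'


    class Paragraph(yaml.YAMLObject):
        yaml_tag = '!Paragraph'


    # Sequence node: build a plain list from the items under !Column.
    def column_constructor(loader, node):
        return loader.construct_sequence(node)

    yaml.add_constructor('!Column', column_constructor)


    # Called without an explicit Loader, so the default (unsafe) one is used.
    data = yaml.load(yaml_str)
    yaml.dump(data, sys.stdout, default_flow_style=False)
    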
    

    this prints:

    - !Table
      header:
      - - !Paragraph
          text: header1
      - - !Paragraph
          text: header2
    

    and, because I use ruamel.yaml, also an UnsafeLoaderWarning.

    If you want the output to include !Column, make a Python class Column(list) and have column_constructor return that type. You additionally need to write a representer for Column and register it with the dumper.
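
    The last suggestion could look roughly like the sketch below. It assumes PyYAML (import yaml) with its safe loader and dumper instead of ruamel.yaml. The column_representer function and the shortened document with plain string items are illustrative choices, not part of the original code:

```python
import yaml  # PyYAML, assumed here instead of ruamel.yaml


class Column(list):
    """A plain list that remembers it came from a !Column node."""


def column_constructor(loader, node):
    # deep=True builds nested nodes right away, so the list is complete.
    return Column(loader.construct_sequence(node, deep=True))


def column_representer(dumper, data):
    # Emit the list items as a sequence carrying the !Column tag.
    return dumper.represent_sequence('!Column', data)


# Register on the safe loader/dumper so safe_load/safe_dump know the tag.
yaml.add_constructor('!Column', column_constructor, Loader=yaml.SafeLoader)
yaml.add_representer(Column, column_representer, Dumper=yaml.SafeDumper)

# Hypothetical shortened document with plain string items.
doc = """\
- !Column
  - header1
- !Column
  - header2
"""

data = yaml.safe_load(doc)
dumped = yaml.safe_dump(data, default_flow_style=False)
print(dumped)

# Loading the dumped text again gives Column objects back.
reloaded = yaml.safe_load(dumped)
```

    Because the constructor goes to the loader and the representer to the dumper, both registrations are needed for a round trip that keeps the tag.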