
Overriding from_yaml to add custom YAML tag


Is overriding from_yaml enough to register a tag from a class, or is it necessary to use yaml.add_constructor(Class.yaml_tag, Class.from_yaml)? If I don't use the add_constructor method, my YAML tags are not recognized. Example of what I have:

import yaml

class Something(yaml.YAMLObject):

    yaml_tag = u'!Something'

    @classmethod
    def from_yaml(cls,loader,node):
        # Set attributes to None if not in file
        values = loader.construct_mapping(node, deep=True)
        attr = ['attr1','attr2']
        result = {}
        for val in attr:
            try:
                result[val] = values[val]
            except KeyError:
                result[val] = None
        return cls(**result)

Is this enough for it to work? I'm confused about the use of from_yaml versus any other constructor you would register using the method I mentioned above. I suppose there's something fundamental I'm missing, since the documentation says:

Subclassing YAMLObject is an easy way to define tags, constructors, and representers for your classes. You only need to override the yaml_tag attribute. If you want to define your custom constructor and representer, redefine the from_yaml and to_yaml method correspondingly.


Solution

  • There is indeed no need to register explicitly:

    import yaml
    
    class Something(yaml.YAMLObject):
        yaml_tag = u'!Something'
    
        def __init__(self, *args, **kw):
            print('some_init', args, kw)
    
        @classmethod
        def from_yaml(cls,loader,node):
            # Set attributes to None if not in file
            values = loader.construct_mapping(node, deep=True)
            attr = ['attr1','attr2']
            result = {}
            for val in attr:
                try:
                    result[val] = values[val]
                except KeyError:
                    result[val] = None
            return cls(**result)
    
    yaml_str = """\
    test: !Something
       attr1: 1
       attr2: 2
    """
    
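    # PyYAML 6 and later require an explicit Loader argument; yaml.Loader is
    # one of the loaders that YAMLObject registers its tag on by default
    d = yaml.load(yaml_str, Loader=yaml.Loader)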
    

    which gives:

    some_init () {'attr1': 1, 'attr2': 2}
    

    But there is no need at all to use PyYAML's load(), which is documented to be unsafe. You can use safe_load() instead if you set the yaml_loader class attribute, so that the tag is registered on SafeLoader:

    import yaml
    
    class Something(yaml.YAMLObject):
        yaml_tag = u'!Something'
    
        yaml_loader = yaml.SafeLoader
    
        def __init__(self, *args, **kw):
            print('some_init', args, kw)
    
        @classmethod
        def from_yaml(cls,loader,node):
            # Set attributes to None if not in file
            values = loader.construct_mapping(node, deep=True)
            attr = ['attr1','attr2']
            result = {}
            for val in attr:
                try:
                    result[val] = values[val]
                except KeyError:
                    result[val] = None
            return cls(**result)
    
    yaml_str = """\
    test: !Something
       attr1: 1
       attr2: 2
    """
    
    d = yaml.safe_load(yaml_str)
    

    as this gives the same:

    some_init () {'attr1': 1, 'attr2': 2}
    

    (Tested with both Python 3.6 and Python 2.7.)
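
The effect of yaml_loader can be checked with a small experiment. The sketch below is only an illustration: the class names DefaultRegistered and SafeRegistered, their tags, and the helper try_safe_load are hypothetical and not part of the code above. It assumes that safe_load() rejects an unregistered tag with a ConstructorError, and it relies on the default from_yaml inherited from YAMLObject rather than a custom one.

```python
import yaml
from yaml.constructor import ConstructorError


class DefaultRegistered(yaml.YAMLObject):
    # Hypothetical class: no yaml_loader override, so the tag is registered
    # on the default loader(s) only, not on SafeLoader.
    yaml_tag = u'!DefaultRegistered'


class SafeRegistered(yaml.YAMLObject):
    # Hypothetical class: the tag is registered on SafeLoader.
    yaml_tag = u'!SafeRegistered'
    yaml_loader = yaml.SafeLoader


def try_safe_load(text):
    """Return the loaded data, or the ConstructorError raised by safe_load."""
    try:
        return yaml.safe_load(text)
    except ConstructorError as exc:
        return exc


failed = try_safe_load("x: !DefaultRegistered {attr1: 1}")
loaded = try_safe_load("x: !SafeRegistered {attr1: 1}")

print(type(failed).__name__)   # ConstructorError
print(loaded['x'].attr1)       # 1
```

If the first call also succeeds in your environment, something else has registered that tag on SafeLoader, for example an explicit add_constructor call.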

    The registration is done in the __init__() of the metaclass of yaml.YAMLObject, which runs once when the class body has been executed:

    class YAMLObjectMetaclass(type):
        """
        The metaclass for YAMLObject.
        """
        def __init__(cls, name, bases, kwds):
            super(YAMLObjectMetaclass, cls).__init__(name, bases, kwds)
            if 'yaml_tag' in kwds and kwds['yaml_tag'] is not None:
                cls.yaml_loader.add_constructor(cls.yaml_tag, cls.from_yaml)
                cls.yaml_dumper.add_representer(cls, cls.to_yaml)
    

    So maybe you are somehow interfering with that initialisation in your full class definition. Try to start with a minimal implementation as I did, and add the functionality on your class that you need until things break.
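
One way to follow that advice is to check directly whether the tag was registered when the class was created. This is a minimal sketch, assuming that add_constructor() stores its entries in the loader class's yaml_constructors dictionary; the helper name registered_constructor is hypothetical, and the class is a stripped-down stand-in for the one above.

```python
import yaml


class Something(yaml.YAMLObject):
    # yaml_tag is set in the class body, which is what the metaclass checks
    # before it calls add_constructor on the loader.
    yaml_tag = u'!Something'
    yaml_loader = yaml.SafeLoader


def registered_constructor(tag, loader_cls=yaml.SafeLoader):
    """Return the constructor registered for `tag` on `loader_cls`, or None."""
    return loader_cls.yaml_constructors.get(tag)


print(registered_constructor(u'!Something') is not None)     # True
print(registered_constructor(u'!NeverDefined') is not None)  # False
```

If the lookup returns None for your own tag, the metaclass never registered it. Going by the condition in the metaclass code shown above, that happens when yaml_tag is missing from the class body itself or is set to None. It also helps to confirm that the loader you check is the one you actually load with.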