
Deserializing a dictionary with custom class keys fails in PyYAML


I am trying to use PyYAML to serialize a dictionary that uses instances of SampleClass as keys. Serialization works, but loading the result with yaml.load() raises an exception:

AttributeError: 'SampleClass' object has no attribute 'name'

How can this be fixed? The SampleClass looks like this:

import uuid

class SampleClass:

    def __init__(self, name = "<NO NAME>"):
        self.objects = []
        self.name = name
        self.id = uuid.uuid1()

    def __eq__(self, other):
        if isinstance(other, SampleClass):
            return self.name == other.name and \
                self.objects == other.objects and \
                self.id == other.id
        else:
            return False

    def __hash__(self):
        return hash((str(self.name), str(self.id)))

Solution

  • PyYAML is a bit outdated: it only supports YAML 1.1, which was superseded by YAML 1.2 back in 2009. Also note that although PyYAML can parse complex keys in YAML mappings (e.g. keys that are themselves sequences or mappings), which are valid YAML, it fails when constructing them in Python, so it is effectively unable to load them.

    With ruamel.yaml (disclaimer: I am the author of that package), you can simply do:

    import uuid
    from io import StringIO

    import ruamel.yaml
    
    class SampleClass:
    
        def __init__(self, name = "<NO NAME>"):
            self.objects = []
            self.name = name
            self.id = uuid.uuid1()
    
        def __eq__(self, other):
            if isinstance(other, SampleClass):
                return self.name == other.name and \
                    self.objects == other.objects and \
                    self.id == other.id
            else:
                return False
    
        def __hash__(self):
            return hash((str(self.name), str(self.id)))
    
        def __repr__(self):
            return "SampleClass({})".format(self.name)
    
    data = {SampleClass("abc"): 1, SampleClass("xyz"): 42}
    
    yaml = ruamel.yaml.YAML(typ="unsafe")
    buf = StringIO()
    yaml.dump(data, buf)
    x = yaml.load(buf.getvalue())
    print(x)
    

    which gives:

    {SampleClass(abc): 1, SampleClass(xyz): 42}
    

    I do recommend, however, providing to_yaml and from_yaml routines for SampleClass and registering the class (see the ruamel.yaml documentation). This allows you to do away with unsafe loading (which, incidentally, was the default for yaml.load() in PyYAML before version 5.1).