python python-3.x yaml pyyaml

Can't construct object from parameter in constructor called by PyYAML


I have a YAML file that looks like this:

---
!Frog
    name: tree frog
    colour: green
    friends:
        - !Frog
          name: motorbike frog
        - !Frog
          name: blue arrow frog

And a Python program that uses PyYAML to create objects according to the file:

import yaml

class Frog():
    def __init__(self, name, colour="", friends=None):
        self.name = name
        self.colour = colour
        self.friends = {}
        if friends != None:
            for f in friends:
                self.friends[f.name] = f
        print("{}'s friends: {}".format(self.name, self.friends))

# Constructor for YAML
def frogConstructor(loader, node) :
    fields = loader.construct_mapping(node)
    return Frog(**fields)

yaml.add_constructor('!Frog', frogConstructor)

f = open("frog.yaml")
loaded = yaml.load(f)

As the code above shows, I'm trying to build a self.friends dictionary in the __init__ method from the friends parameter, where each key is a frog's name and each value is the actual frog object. However, the code produces the following output:

tree frog's friends: {}
motorbike frog's friends: {}
blue arrow frog's friends: {}

As you can see, the self.friends dictionary is empty for all three of the frogs, but the tree frog should have two friends. If I simply make self.friends = friends, it works as expected: self.friends is a list of the friend frogs. What am I doing wrong?


Solution

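  • It is not surprising that things work when you do self.friends = friends: you assign an initially empty list to self.friends, and the YAML loader appends to that same list later. Your loop, in contrast, runs inside __init__ while the list is still empty, so the dictionary stays empty.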
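
    The effect can be reproduced without any YAML involved. The following is a minimal sketch; Pond and its two attributes are hypothetical names used only for illustration, and the list named pending stands in for the list the loader fills in later:

```python
class Pond:
    def __init__(self, frogs):
        # Keeps a reference to the caller's list, so later appends stay visible.
        self.frogs_by_reference = frogs
        # Iterates right now; a list that is still empty contributes nothing.
        self.frogs_by_name = {name: name.upper() for name in frogs}


pending = []                  # the object is built while the list is empty...
pond = Pond(pending)
pending.append("tree frog")   # ...and the list is filled in afterwards

print(pond.frogs_by_reference)  # ['tree frog']
print(pond.frogs_by_name)       # {}
```

    The attribute that merely stores the list sees the later append, while the dictionary built during __init__ does not.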

    If you want that list to be filled before your Frog() is constructed, you have to pass deep=True to construct_mapping(). Doing so makes sure that the underlying non-scalar nodes are fully constructed first, just like the scalar ones.

    def frogConstructor(loader, node):
        fields = loader.construct_mapping(node, deep=True)
        return Frog(**fields)
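
    For a quick check, here is a self-contained sketch of the deep=True variant. It assumes PyYAML is installed, reads the document from a string instead of frog.yaml, and registers the constructor on SafeLoader; the dictionary comprehension replaces the original loop and the print call is dropped:

```python
import textwrap

import yaml  # PyYAML


class Frog:
    def __init__(self, name, colour="", friends=None):
        self.name = name
        self.colour = colour
        # With deep=True the friends list is already complete at this point.
        self.friends = {f.name: f for f in (friends or [])}


def frog_constructor(loader, node):
    # deep=True builds nested sequences and mappings before returning.
    fields = loader.construct_mapping(node, deep=True)
    return Frog(**fields)


yaml.add_constructor("!Frog", frog_constructor, Loader=yaml.SafeLoader)

DOCUMENT = textwrap.dedent("""\
    !Frog
        name: tree frog
        colour: green
        friends:
            - !Frog
              name: motorbike frog
            - !Frog
              name: blue arrow frog
    """)

tree_frog = yaml.safe_load(DOCUMENT)
print(sorted(tree_frog.friends))  # ['blue arrow frog', 'motorbike frog']
```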
    

    There are, however, a few more problems with your code (none of them prevents the above from working):

    • None is a singleton, so the identity test if friends is not None: is more appropriate than if friends != None:
    • yaml.load is unsafe, so if you do not have complete control over your input, that might mean a wiped disk (or worse). Older PyYAML releases do not warn you about that (5.1 and later do, and 6.0 requires an explicit Loader argument); in my ruamel.yaml parser you explicitly have to provide the unsafe Loader to prevent a warning message.
    • If tree frog is narcissistic enough to consider itself a friend of itself, or if one of its friends considers tree frog a friend, you might want to use an anchor and alias to indicate so (and not just use the same name on a different Frog), and that is not going to work with the simple constructor you are using.
    • frogConstructor, as a function name, should not be camel case; use frog_constructor instead (PEP 8 recommends lowercase names with underscores for functions).
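
    To illustrate the first point: != goes through the object's comparison methods, which a class may override, whereas is not compares identity. A small sketch, with AgreeableFrog being a hypothetical class invented for the demonstration:

```python
class AgreeableFrog:
    """Hypothetical class that claims to be equal to everything."""

    def __eq__(self, other):
        return True


frog = AgreeableFrog()

print(frog != None)      # False: != falls back to inverting __eq__
print(frog is not None)  # True: identity cannot be overridden
```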

    Because of the above, I would not use the deep=True parameter but go for a safer and more complete solution with a two-stage constructor:

    from ruamel import yaml
    
    class Frog():
        def __init__(self, name):
            self.name = name
    
        def set_values(self, colour="", friends=None):
            self.colour = colour
            self.friends = {}
            if friends is not None:
                for f in friends:
                    self.friends[f.name] = f
            print("{}'s friends: {}".format(self.name, self.friends))
    
        def __repr__(self):
            return "Frog({})".format(self.name)
    
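    # Two-stage constructor for YAML: yield the bare Frog first so that
    # aliases can refer to it, then fill in the remaining fields once the
    # nested nodes have been constructed.
    def frog_constructor(loader, node):
        fields = loader.construct_mapping(node)
        frog = Frog(fields.pop('name'))
        yield frog
        frog.set_values(**fields)
    
    yaml.add_constructor('!Frog', frog_constructor, yaml.SafeLoader)
    
    with open("frog.yaml") as f:
        loaded = yaml.safe_load(f)
    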
    

    With that you can parse this frog.yaml:

    !Frog &tree_frog
        name: tree frog
        colour: green
        friends:
            - !Frog
              name: motorbike frog
              friends:
                - *tree_frog
            - !Frog
              name: blue arrow frog
    

    which gives as output:

    tree frog's friends: {'blue arrow frog': Frog(blue arrow frog), 'motorbike frog': Frog(motorbike frog)}
    motorbike frog's friends: {'tree frog': Frog(tree frog)}
    blue arrow frog's friends: {}
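
    Since the question is about PyYAML, here is a sketch of the same two-stage pattern with plain PyYAML, checking that the alias really resolves to the very same object. It assumes PyYAML is installed and loads the document from a string; FrogLoader is a hypothetical loader subclass, used so that the !Frog tag is not registered on the shared SafeLoader:

```python
import textwrap

import yaml  # PyYAML


class Frog:
    def __init__(self, name):
        self.name = name
        self.colour = ""
        self.friends = {}

    def set_values(self, colour="", friends=None):
        self.colour = colour
        self.friends = {f.name: f for f in (friends or [])}


class FrogLoader(yaml.SafeLoader):
    """Hypothetical loader subclass that carries the !Frog constructor."""


def frog_constructor(loader, node):
    fields = loader.construct_mapping(node)
    frog = Frog(fields.pop("name"))
    yield frog                 # stage 1: the object exists, aliases can use it
    frog.set_values(**fields)  # stage 2: the friends list has been filled


FrogLoader.add_constructor("!Frog", frog_constructor)

DOCUMENT = textwrap.dedent("""\
    !Frog &tree_frog
        name: tree frog
        colour: green
        friends:
            - !Frog
              name: motorbike frog
              friends:
                  - *tree_frog
            - !Frog
              name: blue arrow frog
    """)

tree_frog = yaml.load(DOCUMENT, Loader=FrogLoader)
motorbike_frog = tree_frog.friends["motorbike frog"]
print(motorbike_frog.friends["tree frog"] is tree_frog)  # True
```

    The identity check is the point of the anchor and alias: motorbike frog's friend is the tree frog object itself, not a second Frog that happens to carry the same name.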