PyYAML: cannot access nested instance attributes from within the constructor class's __init__ method


I'm trying to use PyYAML's add_constructor() function to return a class instance from a YAML node.

Constructor class:

class PrintAttributes():
    
    def __init__(self, a, b):
        self._a = a
        self.b = b
        print(f"instance a => attr: {self._a}, arg: {a}")
        print(f"instance b => attr: {self.b}, arg: {b}")
    
    def __repr__(self):
        return "%s(a=%r, b=%r)" % (self.__class__.__name__, self._a, self.b)

It works as expected when instantiated manually:

>>> PrintAttributes(3, 4)
instance a => attr: 3, arg: 3
instance b => attr: 4, arg: 4
PrintAttributes(a=3, b=4)

These are the two YAML files I'll be using:

simple_yaml = """
a: 3
b: 4
"""

nested_yaml = """
a: 
  value: 3
b:
  value: 4
"""

Both load properly with yaml.load():

>>> import io
>>> import yaml
>>> io_simple = io.StringIO(simple_yaml)
>>> io_nested = io.StringIO(nested_yaml)
>>> d_simple = yaml.load(io_simple, Loader=yaml.SafeLoader)
>>> print(d_simple)
{'a': 3, 'b': 4}

>>> d_nested = yaml.load(io_nested, Loader=yaml.SafeLoader)
>>> print(d_nested)
{'a': {'value': 3}, 'b': {'value': 4}}

And again, if I pass the above dictionaries to the class constructor, it works properly:

>>> print(PrintAttributes(**d_simple))
instance a => attr: 3, arg: 3
instance b => attr: 4, arg: 4
PrintAttributes(a=3, b=4)

>>> print(PrintAttributes(**d_nested))
instance a => attr: {'value': 3}, arg: {'value': 3}
instance b => attr: {'value': 4}, arg: {'value': 4}
PrintAttributes(a={'value': 3}, b={'value': 4})

Now, let's create the auxiliary functions used to define the custom "constructor" and map the loaded YAML to the custom class PrintAttributes:

def PrintAttributes_constructor(loader: yaml.SafeLoader, node: yaml.nodes.MappingNode) -> PrintAttributes:
    """Construct a PrintAttributes instance from a YAML mapping node."""
    return PrintAttributes(**loader.construct_mapping(node))
    
def get_loader():
    """Add constructors to PyYAML loader."""
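    # add_constructor() is a classmethod, so this registers the tag on
    # yaml.SafeLoader itself, not on a private copy.
    loader = yaml.SafeLoader
    loader.add_constructor('!PrintAttributes', PrintAttributes_constructor)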
    return loader

And these are the above YAML files with the additional tag !PrintAttributes used to invoke the constructor:

simple_yaml_construct = """
tag_here: !PrintAttributes
 a: 3
 b: 4
"""

nested_yaml_construct = """
tag_here: !PrintAttributes
  a: 
    value: 3
  b:
    value: 4
"""

This works properly when I use the simple (i.e. not nested) YAML file:

>>> io_simple_construct = io.StringIO(simple_yaml_construct)
>>> print(yaml.load(io_simple_construct, Loader=get_loader()))
instance a => attr: 3, arg: 3
instance b => attr: 4, arg: 4
{'tag_here': PrintAttributes(a=3, b=4)}

However, when I load the nested YAML file, the arguments are empty during the instance's initialization:

>>> io_nested_construct = io.StringIO(nested_yaml_construct)
>>> print(yaml.load(io_nested_construct, Loader=get_loader()))
instance a => attr: {}, arg: {}
instance b => attr: {}, arg: {}
{'tag_here': PrintAttributes(a={'value': 3}, b={'value': 4})}

My aim is to access those values during the instance's initialization in order to update a class variable. I can do this with the first YAML file, but I cannot access the nested values of the second one.

However, I do see the nested values in the instance's representation (i.e. __repr__()), so they are definitely loaded.

Another thing to notice is how the class' name returned by __repr__() changes from PrintAttributes to tag_here.

Am I missing anything, or is it just some PyYAML limitation?

Thanks


Solution

  • You need to pass the additional argument deep=True to construct_mapping():

        return PrintAttributes(**loader.construct_mapping(node, deep=True))
    

    Without this (the default is deep=False), PyYAML builds nested mappings lazily: the child dictionaries a and b are handed to your constructor empty and are filled in later, after it returns. This is why they are empty inside __init__ but hold the proper values once loading has finished.
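
To see the difference side by side, here is a minimal sketch that loads the nested document once without and once with deep=True. The names Recorder, ShallowLoader, DeepLoader and seen_in_init are hypothetical and not part of the question or of PyYAML. Registering the tag on SafeLoader subclasses instead of on yaml.SafeLoader itself is a choice made for this sketch, so that the two variants do not interfere with each other.

```python
import yaml


class Recorder:
    """Hypothetical stand-in for PrintAttributes that records what __init__ saw."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        # Copy the arguments so that later in-place updates by the loader
        # do not change this snapshot.
        self.seen_in_init = {"a": dict(a), "b": dict(b)}


class ShallowLoader(yaml.SafeLoader):
    """Hypothetical loader whose constructor uses the default deep=False."""


class DeepLoader(yaml.SafeLoader):
    """Hypothetical loader whose constructor passes deep=True."""


def shallow_constructor(loader, node):
    # Nested mappings arrive as empty dicts and are filled after this returns.
    return Recorder(**loader.construct_mapping(node))


def deep_constructor(loader, node):
    # Nested mappings are fully built before Recorder.__init__ runs.
    return Recorder(**loader.construct_mapping(node, deep=True))


# add_constructor() on a subclass registers the tag for that subclass only.
ShallowLoader.add_constructor("!PrintAttributes", shallow_constructor)
DeepLoader.add_constructor("!PrintAttributes", deep_constructor)

NESTED = """
tag_here: !PrintAttributes
  a:
    value: 3
  b:
    value: 4
"""

shallow = yaml.load(NESTED, Loader=ShallowLoader)["tag_here"]
deep = yaml.load(NESTED, Loader=DeepLoader)["tag_here"]

print(shallow.seen_in_init)  # what __init__ saw with deep=False
print(shallow.a)             # the same attribute after loading has finished
print(deep.seen_in_init)     # what __init__ saw with deep=True
```

With the default, the snapshot taken in __init__ holds empty dictionaries even though shallow.a is populated once yaml.load() returns. With deep=True, the snapshot already contains the nested values.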