pyyaml and yaml sublevels


I'm trying to use PyYAML to parse a YAML file into a Python object.

However, a question came up along the way.

I have this YAML file:

first_lvl:
   second_lvl:
       item_a:
          - value_a : "value aaa"
          - value_b : "value bbb"

My Python script reads the YAML and loads it into an object:

import yaml

class Struct:
    def __init__(self, **entries):
        self.__dict__.update(entries)

with open("job_file.yml") as f:  # the file name must be a string
    skeleton = yaml.full_load(f)

MyJob = Struct(**skeleton)

print(MyJob.first_lvl)

That works fine, but only for the first level of the YAML. What if I want to reach the sub-level values of the YAML file, which are supposed to be contained in the object, like this:

    print(MyJob.first_lvl.second_lvl)

It might not be related to PyYAML itself and more to the way Python handles objects, but I'm still lost.

Can anyone shed some light?


Solution

  • In your comments you indicate you would prefer to have only one Struct instance loaded. However, if you want to access the value "value aaa" by writing my_job.first_lvl.second_lvl.item_a.0.value_a, you cannot do so by providing a __getattr__ method on Struct: the access will still throw an error stating that the dict object has no attribute second_lvl (it only has a key second_lvl). The __getattr__ never gets called because, on the Struct instance, attribute lookup doesn't fail: my_job.first_lvl succeeds and returns a plain dict, and it is the lookup of second_lvl on that dict that raises.

    What you can do is provide a lookup method that takes a "dotted" string as its argument:

    import ruamel.yaml
    
    yaml_str = """\
    first_lvl:
       second_lvl:
           item_a:
              - value_a : "value aaa"
              - value_b : "value bbb"
    """
    
    class Struct:
        def __init__(self, **entries):
            self.__dict__.update(entries)
    
        def lookup(self, s):
            def recurse(d, names):
                name = names[0]
                if isinstance(d, list):  # list indices cannot be strings
                    name = int(name)                
                if len(names) > 1:
                    return recurse(d[name], names[1:])
                return d[name]
    
            names = s.split('.')
            return recurse(getattr(self, names[0]), names[1:])
    
    yaml = ruamel.yaml.YAML(typ='safe')
    my_job = Struct(**yaml.load(yaml_str))
    
    print(my_job.lookup("first_lvl.second_lvl.item_a.0.value_a"))
    

    which gives:

    value aaa
    

    This answer shows an alternative that extends the default data structure when using ruamel.yaml in round-trip mode.

    If you really want to write my_job.first_lvl.second_lvl.item_a.0.value_a, without quotes, I don't think there is a way around making every level aware of looking up attributes. That means extending Struct, so that the class can be constructed from mappings and sequences. That can be done after loading the YAML, but IMO it is best done during the construction of the YAML.
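
A minimal sketch of the "after loading" variant, assuming every mapping key is a string that is also a valid Python identifier. The `wrap` helper is hypothetical (it is not part of PyYAML or ruamel.yaml), and the plain dict below only stands in for what the loader would return for the YAML in the question. Note that `.0.` is not valid Python syntax, so list items are still reached with `[0]`:

```python
class Struct:
    """Attribute-style view over a loaded mapping (illustrative only)."""

    def __init__(self, **entries):
        # Wrap every value so nested mappings become Struct instances too.
        for key, value in entries.items():
            setattr(self, key, wrap(value))


def wrap(value):
    """Hypothetical helper: recursively convert dicts to Struct, keep lists as lists."""
    if isinstance(value, dict):
        return Struct(**value)
    if isinstance(value, list):
        return [wrap(item) for item in value]
    return value  # scalars (str, int, ...) are returned unchanged


# Stand-in for the data structure loaded from the YAML in the question.
skeleton = {
    "first_lvl": {
        "second_lvl": {
            "item_a": [
                {"value_a": "value aaa"},
                {"value_b": "value bbb"},
            ]
        }
    }
}

my_job = Struct(**skeleton)
print(my_job.first_lvl.second_lvl.item_a[0].value_a)
```

The trade-off is that the whole tree is copied once after loading, and keys that are not valid identifiers can then only be reached through `getattr`. Doing the conversion while the YAML is being constructed avoids that extra pass.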