python-3.x kubernetes ruamel.yaml

ruamel.yaml: deserialize YAML into Python class instances without using tags


Imagine we have a folder of .yaml files containing Kubernetes objects, say Deployments, ConfigMaps and HPAs.

./file1.yaml # {'kind': 'Deployment', ... }, {'kind': 'ConfigMap', ...}
./file2.yaml # {'kind': 'ConfigMap', ... }, {'kind': 'HorizontalPodAutoscaler', ... }

I need to deserialize them into instances of the proper class but, unlike the regular deserialization method, I want to avoid relying on YAML tags and choose the class from the YAML body instead (which is why I have doubts about the register_class() approach). The 'kind' key should identify the proper class.

The end goal is to parse, modify and dump those objects back, preserving comments and formatting, so those classes would be subclasses of CommentedMap or something similar.

Is there a way in ruamel.yaml to parse YAML into instances of classes like these?

from ruamel.yaml.comments import CommentedMap

class KubeObjectBase(CommentedMap):

    def some_additional_func(self):
        pass

class Deployment(KubeObjectBase):

    def deployment_method(self):
        pass

class ConfigMap(KubeObjectBase):
    pass

Solution

  • I am not entirely sure what the YAML files actually look like. The part after # in your example isn't correct YAML, so I made things up.

    This doesn't affect the processing needed to get what you want. As long as you have valid, loadable YAML, just recurse over the data and replace entries.

    You need to somehow map the value of kind to your actual classes. If there are not that many classes, just make a string-to-class dictionary; if you have many, you could scan your Python files and create that map automatically (either from the class name or from some class attribute):

    import sys
    import ruamel.yaml
    FA = ruamel.yaml.comments.Format.attrib
    from pathlib import Path
    
    file1 = Path('file1.yaml')
    file1.write_text("""\
    - {'kind': 'Deployment', a: 1}
    - kind: ConfigMap
      b:
        kind: Deployment
        c: 3
        x: 42
    """)
    file2 = Path('file2.yaml')
    file2.write_text("""\
    [
    {'kind': 'ConfigMap', d: 4}, 
    {'kind': 'HorizontalPodAutoscaler', e: 5},
    ]
    """)
    
        
    kob_map = {}
    class KubeObjectBase(ruamel.yaml.comments.CommentedMap):
        def some_additional_func(self):
            pass
    
        def __repr__(self):
            return f"{self.__class__.__name__}({', '.join([f'{k}: {v}' for k, v in self.items()])})"
    
    class Deployment(KubeObjectBase):
        def deployment_method(self):
            pass
    kob_map['Deployment'] = Deployment
    
    
    class ConfigMap(KubeObjectBase):
        pass
    kob_map['ConfigMap'] = ConfigMap
    
    
    class HorizontalPodAutoscaler(KubeObjectBase):
        pass
    kob_map['HorizontalPodAutoscaler'] = HorizontalPodAutoscaler
    
    yaml = ruamel.yaml.YAML()
    for v in kob_map.values():
        yaml.Representer.add_representer(v, yaml.Representer.represent_dict)
    
    
    def un_kind(d, map):
        if isinstance(d, dict):
            for k, v in d.items():
                un_kind(v, map)
                try:
                    if 'kind' in v:
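                        # typ = map[v.pop('kind')]
                        typ = map[v['kind']]
                        d[k] = nv = typ(v)
                        # copy flow/block style and comments onto the new instance
                        # (not onto the class)
                        setattr(nv, FA, v.fa)
                        setattr(nv, ruamel.yaml.comments.Comment.attrib, v.ca)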
                except TypeError:
                    pass
        elif isinstance(d, list):
            for idx, elem in enumerate(d):
                un_kind(elem, map)
                try:
                    if 'kind' in elem:
                        # typ = map[elem.pop('kind')]
                        typ = map[elem['kind']]
                        d[idx] = nv = typ(elem)
                        setattr(nv, FA, elem.fa)
                        # Comment.attrib is the attribute name that .ca reads from
                        setattr(nv, ruamel.yaml.comments.Comment.attrib, elem.ca)
                except TypeError:
                    pass
    
    
    for fn in Path('.').glob('*.yaml'):
        data = yaml.load(fn)
        print(f'{fn}:')
        un_kind(data, kob_map)
        print(list(data))
        yaml.dump(data, sys.stdout)
    

    which gives:

    file1.yaml:
    [Deployment(kind: Deployment, a: 1), ConfigMap(kind: ConfigMap, b: Deployment(kind: Deployment, c: 3, x: 42))]
    - {kind: Deployment, a: 1}
    - kind: ConfigMap
      b:
        kind: Deployment
        c: 3
        x: 42
    file2.yaml:
    [ConfigMap(kind: ConfigMap, d: 4), HorizontalPodAutoscaler(kind: HorizontalPodAutoscaler, e: 5)]
    [{kind: ConfigMap, d: 4}, {kind: HorizontalPodAutoscaler, e: 5}]
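
If you have many classes, one possible way to build the kind-to-class map automatically is to let the base class register each subclass as it is defined. The following is only a sketch under assumptions: the `kube_kind` class attribute and the `HPA` class are hypothetical names introduced for illustration, and the `dict` fallback is there only so that the snippet runs without ruamel.yaml installed.

```python
try:
    from ruamel.yaml.comments import CommentedMap as MapBase
except ImportError:
    MapBase = dict  # stand-in only, so the sketch runs without ruamel.yaml

kob_map = {}


class KubeObjectBase(MapBase):
    # Hypothetical attribute: set it on a subclass when the class name
    # differs from the value of the 'kind' key in the YAML.
    kube_kind = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Only look at the subclass's own namespace, so an override is not
        # inherited by further subclasses; fall back to the class name.
        kob_map[cls.__dict__.get('kube_kind') or cls.__name__] = cls


class Deployment(KubeObjectBase):
    pass


class ConfigMap(KubeObjectBase):
    pass


class HPA(KubeObjectBase):
    kube_kind = 'HorizontalPodAutoscaler'


print(sorted(kob_map))
# ['ConfigMap', 'Deployment', 'HorizontalPodAutoscaler']
```

With this in place the explicit `kob_map[...] = ...` lines are no longer needed, while the loop that adds a representer for every value in `kob_map` and the call to `un_kind()` can stay as they are.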