Tags: python, serialization, deserialization, pyyaml

How to ignore attributes when using yaml.dump?


I am trying to hide certain attributes when dumping an object to YAML. I tried the approach from another answer: dumping works, but loading does not.

import yaml
from copy import deepcopy

class SecretYamlObject(yaml.YAMLObject):
    hidden_fields = []

    @classmethod
    def to_yaml(cls, dumper, data):
        new_data = deepcopy(data)
        for item in cls.hidden_fields:
            del new_data.__dict__[item]
        return dumper.represent_yaml_object(cls.yaml_tag, new_data, cls,
                                            flow_style=cls.yaml_flow_style)

class Trivial(SecretYamlObject):
    hidden_fields = ["_ignore"]
    yaml_tag = u'!Trivial'
    def __init__(self):
        self.a = 1
        self.b = 2
        self._ignore = 3

Running this code:

import yaml
yaml.load('!Trivial {a: 1, b: 2}')

I get the following error:

ConstructorError: could not determine a constructor for the tag '!Trivial'
  in "", line 1, column 1:
    !Trivial {a: 1, b: 2}

I tried to work around this with a classproperty so that yaml can find the constructor, something like this:

class SecretYamlObject(yaml.YAMLObject):
    # ... same as before; removed for brevity

class Trivial(SecretYamlObject):
    hidden_fields = ["_ignore"]

    @classproperty  # decorator definition not shown here for brevity
    def yaml_tag(cls):
        return '!!python/object:{}.{}'.format(cls.__module__, cls.__name__)

    def __init__(self):
        self.a = 1
        self.b = 2
        self._ignore = 3

This produces an invalid YAML string:

import yaml
from secret_yaml2 import Trivial
print(yaml.dump(Trivial()))

!%21python/object:secret_yaml2.Trivial {a: 1, b: 2}

For some reason, the second ! is converted to %21, which again causes a constructor error.

By the way, the code below works, but it requires importing the object's class before loading the YAML, and I might not know the class in advance.

import yaml
from secret_yaml import Trivial
yaml.load(yaml.dump(Trivial()))

I'm trying to write a class that knows how to dump itself to YAML properly and that can still be loaded through a normal yaml.load call.


Solution

  • You can define __getstate__, which is used by both Python's native pickle and PyYAML:

    import yaml

    class A(object):
        def __init__(self):
            self.hidden = 42
            self.visible = 5

        def __getstate__(self):
            # Copy the instance dict so the live object keeps its attributes
            state = self.__dict__.copy()
            del state['hidden']
            return state

    a = A()
    d = yaml.dump(a)
    print(d)

    This prints:

    !!python/object:__main__.A
    visible: 5
    

    This is a straightforward way to hide attributes from serializers such as PyYAML or pickle, and it needs no custom tag or representer.
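
To connect this back to the hidden_fields idea from the question, here is a minimal round-trip sketch. It assumes PyYAML 5.1 or later (for yaml.unsafe_load). The HiddenFieldsMixin name and the choice to reset the hidden attribute to None in __setstate__ are illustrative assumptions, not part of the original answer. Loading a python/object tag does not call __init__, so without __setstate__ the hidden attribute would simply be absent on the loaded object.

```python
import yaml


class HiddenFieldsMixin(object):
    """Hypothetical helper: omits the attributes named in hidden_fields on dump."""

    hidden_fields = ()

    def __getstate__(self):
        # Work on a copy so the live object keeps its hidden attributes.
        state = self.__dict__.copy()
        for name in self.hidden_fields:
            state.pop(name, None)
        return state


class Trivial(HiddenFieldsMixin):
    hidden_fields = ("_ignore",)

    def __init__(self):
        self.a = 1
        self.b = 2
        self._ignore = 3

    def __setstate__(self, state):
        # __init__ is not run on load, so give the hidden field a default
        # (None is an arbitrary choice for this sketch).
        self.__dict__.update(state)
        self._ignore = None


dumped = yaml.dump(Trivial())

# unsafe_load can construct arbitrary Python objects; only use it on trusted input.
loaded = yaml.unsafe_load(dumped)
```

The dumped text carries a python/object tag naming the module and class, so the loader can import the class by itself as long as its module is importable. The cost is that a loader capable of building arbitrary Python objects is required, which is why this sketch uses yaml.unsafe_load.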