Tags: python-2.7, yaml, pyyaml, ruamel.yaml

Suppress !!python/unicode in YAML output


When dumping the dict data = {'abc': 'def'} as YAML with ruamel.yaml or PyYAML (using default_flow_style=False) in Python 2.7, you get:

abc: def

which is fine. However, if you make all strings unicode (by prefixing them with u or by using from __future__ import unicode_literals), this gets dumped as:

!!python/unicode 'abc': !!python/unicode 'def'

How can I dump all strings (u-prefixed or not) without a tag, and without resorting to safe_dump()? Adding allow_unicode=True doesn't do the trick.

Complete example that generates the unwanted tags:

from __future__ import unicode_literals

import sys
import ruamel.yaml

data = {'abc': 'def'}
ruamel.yaml.dump(data, sys.stdout, allow_unicode=True, default_flow_style=False)

Solution

  • You need a different representer that handles the unicode-to-str conversion:

    from __future__ import unicode_literals
    
    import sys
    import ruamel.yaml
    
    def my_unicode_repr(self, data):
        return self.represent_str(data.encode('utf-8'))
    
    ruamel.yaml.representer.Representer.add_representer(unicode, my_unicode_repr)
    
    data = {'abc': u'def'}
    ruamel.yaml.dump(data, sys.stdout, allow_unicode=True, default_flow_style=False)
    

    This gives:

    abc: def
    

    This works for PyYAML as well; just replace ruamel.yaml with yaml.
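
A variation on the same idea, as a minimal sketch rather than the method from the answer above: instead of registering the representer globally on Representer, it can be attached to a Dumper subclass so that other dump() calls in the same process are unaffected. The names PlainUnicodeDumper, represent_text_plain and text_type are made up for this illustration. The sketch uses PyYAML and assumes the ruamel.yaml equivalent behaves the same way; it also passes the text directly to represent_scalar with the standard string tag rather than encoding it to UTF-8 first.

```python
from __future__ import unicode_literals

import yaml  # PyYAML


class PlainUnicodeDumper(yaml.Dumper):
    """Hypothetical Dumper subclass that keeps the change local to this class."""


def represent_text_plain(dumper, data):
    # Use the standard YAML string tag, which is left implicit for plain
    # scalars, instead of the Python-specific unicode tag.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)


# type('') is unicode on Python 2 (because of unicode_literals) and str on Python 3.
text_type = type('')
PlainUnicodeDumper.add_representer(text_type, represent_text_plain)

data = {'abc': 'def'}
# Without a stream argument, dump() returns the YAML text instead of writing it.
output = yaml.dump(data, Dumper=PlainUnicodeDumper,
                   allow_unicode=True, default_flow_style=False)
print(output)
```

Because add_representer is called on the subclass, only calls that pass Dumper=PlainUnicodeDumper pick up the new behaviour; a plain yaml.dump(data) keeps its default representers.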