How can I control what scalar form PyYAML uses for my data?


I've got an object with a short string attribute, and a long multi-line string attribute. I want to write the short string as a YAML quoted scalar, and the multi-line string as a literal scalar:

my_obj.short = "Hello"
my_obj.long = "Line1\nLine2\nLine3"

I'd like the YAML to look like this:

short: "Hello"
long: |
  Line1
  Line2
  Line3

How can I instruct PyYAML to do this? If I call yaml.dump(my_obj), it produces a dict-like output:

{long: 'line1

    line2

    line3

    ', short: Hello}

(Not sure why long is double-spaced like that...)
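On the double spacing: PyYAML picked a single-quoted scalar here, and inside a quoted scalar YAML folds a lone line break into a space when loading. To keep a real newline, the emitter writes an empty line for each one, so the odd-looking output still loads back to the original string. A minimal sketch to check this, where the dict and variable names are only illustrative:

```python
import yaml

original = "Line1\nLine2\nLine3\n"

# With no style hint, PyYAML falls back to a single-quoted scalar
# for this multi-line string.
dumped = yaml.dump({"long": original})
print(dumped)

# Each newline in the data shows up as a blank line in the YAML text,
# yet loading the text gives back exactly the original string.
restored = yaml.safe_load(dumped)["long"]
print(restored == original)
```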

Can I dictate to PyYAML how to treat my attributes? I'd like to affect both the order and style.


Solution

  • Based on "Any yaml libraries in Python that support dumping of long strings as block literals or folded blocks?":

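    import yaml
    from collections import OrderedDict
    
    # Marker subclass: strings of this type are dumped double-quoted.
    class quoted(str):
        pass
    
    def quoted_presenter(dumper, data):
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')
    yaml.add_representer(quoted, quoted_presenter)
    
    # Marker subclass: strings of this type are dumped as literal block scalars.
    class literal(str):
        pass
    
    def literal_presenter(dumper, data):
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    yaml.add_representer(literal, literal_presenter)
    
    # Passing the items (not the dict itself) keeps insertion order instead of sorting the keys.
    def ordered_dict_presenter(dumper, data):
        return dumper.represent_dict(data.items())
    yaml.add_representer(OrderedDict, ordered_dict_presenter)
    
    # Keyword-argument order is only guaranteed on Python 3.6+; pass a list of pairs on older versions.
    d = OrderedDict(short=quoted("Hello"), long=literal("Line1\nLine2\nLine3\n"))
    
    print(yaml.dump(d))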
    

    Output

    short: "Hello"
    long: |
      Line1
      Line2
      Line3
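
The question asks about an object rather than a dict, so the same idea can be taken one step further by registering a representer for the class itself. The following is only a sketch under assumptions: `MyObj` and `my_obj_presenter` are hypothetical names standing in for the asker's class, and the marker types are repeated so that the example runs on its own.

```python
import yaml


class quoted(str):
    """Marker type: dump as a double-quoted scalar."""


class literal(str):
    """Marker type: dump as a literal block scalar."""


def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')


def literal_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')


class MyObj:
    """Hypothetical stand-in for the class behind my_obj in the question."""

    def __init__(self, short, long):
        self.short = short
        self.long = long


def my_obj_presenter(dumper, obj):
    # A list of (key, value) pairs is emitted in the order written here,
    # and wrapping each value picks its scalar style.
    pairs = [
        ('short', quoted(obj.short)),
        ('long', literal(obj.long)),
    ]
    return dumper.represent_mapping('tag:yaml.org,2002:map', pairs)


yaml.add_representer(quoted, quoted_presenter)
yaml.add_representer(literal, literal_presenter)
yaml.add_representer(MyObj, my_obj_presenter)

text = yaml.dump(MyObj("Hello", "Line1\nLine2\nLine3\n"))
print(text)
```

With this in place, yaml.dump(my_obj) controls both the key order and the style of each attribute. Note that the long string ends with a newline, as in the answer above; without it PyYAML adds a strip indicator and writes `long: |-` instead of `long: |`.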