I would like to have a custom ruamel.yaml dumper that uses literal style for all multiline strings and the default style otherwise. For example:
import sys
import ruamel.yaml
data = {"a": "hello", "b": "hello\nthere\nworld"}
print("Default style")
yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)
print()
print("style='|'")
yaml = ruamel.yaml.YAML()
yaml.default_style = "|"
yaml.dump(data, sys.stdout)
This produces:
Default style
a: hello
b: "hello\nthere\nworld"
style='|'
"a": |-
  hello
"b": |-
  hello
  there
  world
My desired output is:
a: hello
b: |-
  hello
  there
  world
There are multiple ways to achieve what you want. If you have control over building up the data structure, it is often easiest to add a LiteralScalarString where appropriate:
import sys
import ruamel.yaml

def lim(s):  # literal if multi-line
    if '\n' in s:
        return ruamel.yaml.scalarstring.LiteralScalarString(s)
    return s

data = {'a': lim('hello'), 'b': lim('hello\nthere\nworld')}

yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)
which gives:
a: hello
b: |-
  hello
  there
  world
This gives you easy fine control over what gets dumped as literal style.
If you don't add all the data individually (but e.g. read them from a JSON file), you can walk over your data structure after it is fully constructed and update it in place:
import sys
import ruamel.yaml

def tmtl(d):
    """translate multi-line to literal,
    only acts on dict values and sequence items, not on keys
    """
    if isinstance(d, dict):
        for k, v in d.items():
            if isinstance(v, str) and '\n' in v:
                d[k] = ruamel.yaml.scalarstring.LiteralScalarString(v)
            else:
                tmtl(v)
    elif isinstance(d, list):
        for idx, item in enumerate(d):
            if isinstance(item, str) and '\n' in item:
                d[idx] = ruamel.yaml.scalarstring.LiteralScalarString(item)
            else:
                tmtl(item)  # recurse, so nested lists and dicts are handled too

data = {'a': 'hello', 'b': 'hello\nthere\nworld'}
tmtl(data)

yaml = ruamel.yaml.YAML()
yaml.dump(data, sys.stdout)
which gives:
a: hello
b: |-
  hello
  there
  world
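The walk above changes the data it is given. When the input has to stay untouched, the same walk can return a converted copy instead. What follows is only a minimal sketch: the name `literal_copy` and its `wrap` parameter are made up for this example and are not part of ruamel.yaml. It also assumes plain `dict` and `list` containers, so subclasses such as `CommentedMap` would come back as plain dicts.

```python
def literal_copy(node, wrap=None):
    """Return a copy of node in which multi-line strings are wrapped.

    Only dict values and list items are converted, not dict keys.
    `wrap` is a hypothetical hook; by default it is ruamel.yaml's
    LiteralScalarString.
    """
    if wrap is None:
        # imported lazily, so a different wrapper can be passed in instead
        from ruamel.yaml.scalarstring import LiteralScalarString as wrap
    if isinstance(node, dict):
        return {k: literal_copy(v, wrap) for k, v in node.items()}
    if isinstance(node, list):
        return [literal_copy(item, wrap) for item in node]
    if isinstance(node, str) and '\n' in node:
        return wrap(node)
    return node  # any other scalar is returned as-is, not copied
```

Dumping `literal_copy(data)` with `yaml.dump()` should then give the same output as the in-place version, while `data` itself keeps its plain strings.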
If you cannot update your data, you could rewrite tmtl in the program above so it builds a new data structure and returns that, but at that point it is IMO easier to change the representer:
import sys
import ruamel.yaml

CKS = ruamel.yaml.comments.CommentedKeySeq  # so you can have sequences as keys in a mapping

class MyRepresenter(ruamel.yaml.representer.RoundTripRepresenter):
    def represent_str(self, s):
        if '\n' in s:
            return self.represent_scalar('tag:yaml.org,2002:str', s, style='|')
        return self.represent_scalar('tag:yaml.org,2002:str', s)

MyRepresenter.add_representer(str, MyRepresenter.represent_str)

data = {'a': 'hello', 'b': 'hello\nthere\nworld', CKS((1, 2)): ['nested works\nas well\n\n']}

yaml = ruamel.yaml.YAML()
yaml.Representer = MyRepresenter
yaml.dump(data, sys.stdout)
which gives:
a: hello
b: |-
  hello
  there
  world
[1, 2]:
- |+
  nested works
  as well

...
As you can see, the trailing newlines of the final literal style scalar automatically cause the chomping indicator to change from strip (-) to keep (+) and the explicit document end marker (...) to appear.
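The relation between trailing newlines and the chomping indicator can be made explicit with a small helper. This is only a sketch of the YAML rules as I understand them, not the emitter's actual code: the function name `chomping_indicator` is made up for illustration, non-empty input is assumed, and the indentation indicator that YAML adds for text starting with a space or newline is ignored.

```python
def chomping_indicator(text):
    """Return '-', '' or '+' for a literal block scalar holding text.

    Hypothetical helper, assuming non-empty text with '\n' line breaks.
    """
    if not text.endswith('\n'):
        return '-'  # strip: there is no final line break to preserve
    if text == '\n' or text.endswith('\n\n'):
        return '+'  # keep: the trailing line breaks must all be preserved
    return ''       # clip: exactly one trailing line break, the YAML default

for s in ('hello\nthere\nworld', 'hello\nthere\n', 'nested works\nas well\n\n'):
    print(repr(s), '->', '|' + chomping_indicator(s))
```

The first and last strings are the ones from the examples above, and they map to `|-` and `|+` respectively. A string ending in a single newline needs no indicator at all.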