Python ruamel.yaml library adds new lines where not expected


I'm using ruamel.yaml to load and edit a specific property in a yaml file.

I need to preserve everything else as-is. So far, the following code works almost perfectly:

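import ruamel.yaml

# Round-trip loader/dumper that keeps quoting style and uses the custom indentation
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start = True
yaml.indent(mapping=6, sequence=4, offset=2)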

data = {}
with open("my.yaml", "r") as f:
    data = yaml.load(f)

data["my::property::user::name"] = "me"

with open("my.yaml", "w") as f:
    yaml.dump(data, f)

The YAML file is big, with a lot of properties, and there are a few cases I can't get to work.

yaml.dump adds a line break inside the value of the following key:

my::property::group_name: "path\\Domain Admins"

Resulting in:

my::property::group_name: "%{path\\Domain
      Admins"

For some properties, it adds a new line right after the colon:

my::property::value: some-really-big-string-here

Resulting in:

my::property::value:
      some-really-big-string-here

EDITED:

The following two lines get a third \ added, and the line is also broken:

some::random::name: "\\\\%{expression}\\%{expression}"
another::random::name: "\\\\%{expression}\\pathname\\"

The result is:

some::random::name: "\\\\%{expression}\\\
      %{expression}"
another::random::name: "\\\\%{expression}\\\
      pathname\\"

Maybe my YAML file needs some data fixes, but is it possible to avoid this at the parser level?


Solution

  • I haven't tried to reproduce what you are getting, because I am pretty sure I can't with the examples given.

    The dumper routine tries to fit key: value pairs on a line. The default line length is 80 characters.

    If a value doesn't fit behind a key on a line it can be wrapped. In that case it will be quoted and split on a (single) space and a newline is inserted followed by enough spaces not to cause problems with indentation. If necessary this is repeated.

    If the value cannot be split (because it has no spaces), it will be put on its own on the next line, indented relative to the start of the key. In some situations it will instead insert a backslash followed by a newline.

    This can still overflow the 80 characters, in which case the limit is overruled. If you have a large indent for mappings (as you do) and short keys (shorter than the indent), this might not happen.

    The most direct way to influence this is by setting:

     yaml.width = 4096  
    

    (Choose a value that is larger than your longest line.) This will cause every value to be placed on the same line as its key.

    You could also explicitly "convert" values to ruamel.yaml.scalarstring.LiteralScalarString and then get key: value pairs looking like:

    my::property::group_name: |
          %{path\\Domain Admins
    

    Whatever representation the dumper chooses, depending on your settings, the string that it reads back is the same as the original. So apart from aesthetic reasons, you should not care.

    There is no API or easy hook that allows you to influence the dumper to always/never insert a newline after the value indicator (the ':' plus the space after the key). So I hope using yaml.width is an easy, acceptable solution for you.

    (You can also leave the indent at the more usual default values, which gives less chance of overflowing the standard width.)