The code below reproduces the issue:
from ruamel.yaml import round_trip_load, round_trip_dump, YAML
import sys
obj = round_trip_load('''\
a: 1 # comment for a
b: # comment for b
c: 2 # comment for c
''')
print('-------------------------- YAML().dump(obj, sys.stdout)')
YAML().dump(obj, sys.stdout)
print('-------------------------- print(round_trip_dump(obj))')
print(round_trip_dump(obj))
Output
-------------------------- YAML().dump(obj, sys.stdout)
a: 1 # comment for a
b: # comment for b
c: 2 # comment for c
-------------------------- print(round_trip_dump(obj))
a: 1
b: # comment for b
c: 2
You can see that YAML().dump(obj, sys.stdout) correctly prints the comments, but print(round_trip_dump(obj)) loses "comment for a" and "comment for c".
If you run it in debug mode, you can see that obj.ca.items correctly keeps the comments.
Therefore I believe it is a bug in round_trip_dump().
I was going to create a ticket for the project, but the instructions asked me to post here first, to make sure that it is not faulty usage of the library.
Is it a bug, or am I misusing the library?
The documentation asks why output of dump() as a string is necessary.
I need it because I want to convert the object into a string and save it as a VARCHAR in a database.
Thank you in advance.
Environment: ruamel.yaml==0.17.21 with Python 3.11 on 64 bit Windows
This definitely falls under misuse. The documentation you refer to explains why a new API was needed, and you
combine round_trip_dump/round_trip_load from the old API with a YAML() instance from the new API
and expect things to work, which they don't (otherwise a new API would probably not have
been necessary). Don't use the old API for any new projects.
If you have a streaming API (like the one currently in ruamel.yaml) you can easily grab the output,
using an io.BytesIO() buffer:
import sys
import io
import ruamel.yaml
yaml_str = """\
a: 1 # comment for a
b: # comment for b
c: 2 # comment for c
"""
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load(yaml_str)
buf = io.BytesIO() # yaml.dump generates a stream of utf-8/bytes
yaml.dump(data, buf)
yaml_out = buf.getvalue().decode("utf-8")
print(f'here is the streamed output:\n{yaml_out}')
which gives:
here is the streamed output:
a: 1 # comment for a
b: # comment for b
c: 2 # comment for c
So stick with the new API as it is trivial to convert the output of a stream based API to a string yourself.
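If you need this in several places, the buffer handling can be wrapped in a small helper. The following is only a sketch of the general pattern: dump_to_str is a hypothetical name, not part of ruamel.yaml, and it is demonstrated with the standard library's json.dump (another stream-based dumper) so that it runs without any third-party package.

```python
import io
import json


def dump_to_str(dump, data):
    """Call a stream-writing dump(data, stream) and return the text it wrote.

    `dump` is any callable with that signature that writes str to the stream.
    This helper is hypothetical and not part of any library.
    """
    buf = io.StringIO()  # in-memory text stream
    dump(data, buf)
    return buf.getvalue()


# Demonstration with json.dump, which also writes to a stream.
print(dump_to_str(json.dump, {"a": 1}))
```

With the YAML() instance from the example above, the call should be dump_to_str(yaml.dump, data), assuming a text buffer is acceptable for your use; if you specifically want the UTF-8 bytes, keep the io.BytesIO() variant and decode as shown. The resulting string can then be stored in your VARCHAR column.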
Interpreting a stream parameter with a default value of None as "return as a string" is IMO not the
right thing to build into the (new) API. In the same way, e.g. Python's datetime.date()
constructor creates an object and doesn't have an option to stream built in
(so you can't do datetime.date(2023, 4, 18, stream=sys.stdout)), even though I can imagine there
is someone out there who might want to do that.
(tested on macOS and Linux, but it should work on Windows)