I want to get the comments from the header lines of a YAML file, for example:
# 11111111111111111
# 11111111111111111
# 22222222222222222
# bbbbbbbbbbbbbbbbb
---
start:
....
I used the .ca attribute on the loaded data, but found that these comments are not there. Is there any other way to get these comments?
Currently (ruamel.yaml==0.17.17) the comments that occur
before the document start token (---) are not passed on from the
DocumentStartToken to the DocumentStartEvent, so these comments are
effectively lost during parsing. Even if they were passed on, it is
non-trivial to preserve them, as the DocumentStartEvent is silently
dropped during composition.
You can either put the comments after the end of directives indicator
(---), which allows you to get at the comments using the .ca
attribute without a problem, or remove that indicator altogether as it
is superfluous (at least in your example). Alternatively you will have to
write a small wrapper around the loader:
import sys
import pathlib
import ruamel.yaml

fn = pathlib.Path('input.yaml')


def load_with_pre_directives_comments(yaml, path):
    # Collect the comment lines that precede the first '---' line,
    # then load the complete text as usual.
    comments = []
    text = path.read_text()
    if '\n---\n' not in text and '\n--- ' not in text:
        # no '---' after the first line, so no comments can precede it
        return yaml.load(text), comments
    for line in text.splitlines(True):
        if line.lstrip().startswith('#'):
            comments.append(line)
        elif line.startswith('---'):
            break
    return yaml.load(text), comments


yaml = ruamel.yaml.YAML()
yaml.explicit_start = True  # write the '---' back out when dumping
data, comments = load_with_pre_directives_comments(yaml, fn)
print(''.join(comments), end='')
yaml.dump(data, sys.stdout)
which gives:
# 11111111111111111
# 11111111111111111
# 22222222222222222
# bbbbbbbbbbbbbbbbb
---
start: 42
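
If you would rather go for the first option (comments after the ---) and want to automate it, something like the following might do as a preprocessing step. This is only a sketch: move_header_comments_below_marker is a made-up helper name, not part of ruamel.yaml, and it assumes the header comments are followed by a bare --- line (nothing else on that line, and no %YAML or %TAG directives before it). In any other case the text is returned unchanged.

```python
def move_header_comments_below_marker(text):
    """Move leading comment lines to just after the first bare '---' line.

    Hypothetical helper (not a ruamel.yaml API). Only comments and blank
    lines that precede a bare '---' line are moved; otherwise the text is
    returned unchanged.
    """
    lines = text.splitlines(True)
    header = []
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith('#') or not stripped:
            # comment or blank line before the marker: remember it
            header.append(line)
        elif line.rstrip() == '---':
            if not line.endswith('\n'):
                line += '\n'
            # marker first, then the collected comments, then the rest
            return ''.join([line] + header + lines[index + 1:])
        else:
            # content, a directive, or '--- something': leave the text alone
            break
    return text


source = """\
# 11111111111111111
# bbbbbbbbbbbbbbbbb
---
start: 42
"""
print(move_header_comments_below_marker(source), end='')
```

The resulting string can then be handed to yaml.load() instead of the original text, which puts the comments in the position where, as described above, the .ca attribute can get at them.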