tl;dr: Is there a YAML parser and serializer library, perhaps one using a DOM-like representation, that preserves the non-semantic structure and formatting of a YAML file through a parse-and-serialize round trip, so that the file's content does not change?
Context:
I have a piece of software that is configured through YAML files.
The users are explicitly allowed, even encouraged, to edit these YAML (v1.2) files in their favorite text editor. This obviously includes the ordering of entries in objects, non-significant whitespace such as extra blank lines to structure the file, different styles of multi-line strings, substructures written as JSON, comments, etc.
For less experienced users I now want to write a GUI utility to edit these files as well. I am not limited in the choice of programming language here. C# would be nice, but it's not a must. Open source would be good.
The important requirement is that the utility must preserve all manual edits of the YAML file when saving!
For example, default YAML serializers do not reconstruct anything that is not present in the data model, such as comments or non-semantic blank lines. However, it would be bad UX if a utility swallowed things that a user added manually, likely on purpose.
YAML has no DOM specification, and while most YAML libraries offer access to a node-based, graph-like structure similar to a DOM, these structures are never exact representations of what the file looks like.
Even implementations that do store comments (e.g. go-yaml, ruamel) don't store their exact layout and are not able to reliably reproduce them as they were.
No implementation I know of stores whitespace. At best you can reconstruct it from the line and column numbers in the nodes, but the implementations don't do that.
I have three suggestions for you; one of them might be a viable solution for your use case:
Suggestion 1: Use XML
Your requirements are fulfilled by XML. It is not pretty and not modern but its design caters to your exact use-case where YAML does not. It might be too late in the development cycle to do this, but if it's not, think about it.
Suggestion 2: Separate use-cases for experienced and inexperienced users
Is the same file edited both by an experienced user in an editor and by an inexperienced user with a GUI? If not, why would preserving comments even be a use case when the user never adds any?
Your editor could load a file, serialize it, and, if the result is not identical to the original, show a warning like:
This file has been edited manually, editing it with the GUI will drop all comments and non-standard formatting
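The round-trip check behind that warning can be sketched in Python as follows. This is only an illustration under stated assumptions: `survives_round_trip` and `canonical_dump` are hypothetical names, and the standard `json` module stands in for the YAML library so the snippet runs without extra dependencies. With a real YAML library you would pass its load and dump functions instead.

```python
import json
from typing import Any, Callable


def survives_round_trip(
    text: str,
    load: Callable[[str], Any],
    dump: Callable[[Any], str],
) -> bool:
    """Return True if serializing the loaded data reproduces the text exactly."""
    return dump(load(text)) == text


def canonical_dump(data: Any) -> str:
    """Stand-in serializer: the format the GUI itself would write."""
    return json.dumps(data, indent=2) + "\n"


# A file the GUI wrote itself round-trips unchanged.
generated = canonical_dump({"name": "demo", "ports": [80, 443]})

# A file with a hand-inserted blank line and a compact list does not.
hand_edited = '{\n  "name": "demo",\n\n  "ports": [80, 443]\n}\n'

for label, text in (("generated", generated), ("hand_edited", hand_edited)):
    if survives_round_trip(text, json.loads, canonical_dump):
        print(f"{label}: safe to edit in the GUI")
    else:
        print(f"{label}: warn before saving, manual formatting would be lost")
```

The check is deliberately strict: any byte-level difference counts as a manual edit, which errs on the side of warning too often rather than silently dropping formatting.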
You could even just go with a JSON GUI (those already exist, some even with schema support) because a YAML parser can load JSON. Inexperienced users would then simply work on JSON files, which does not matter since they only ever edit them via the GUI.
Suggestion 3: Hack your way through even if YAML implementations don't support you
What you can do is to use the line & column information on nodes from your YAML implementation. For example, this might be an implementation that yields decent results:
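A minimal Python sketch of one such approach, under stated assumptions: the edited value is a single-line scalar, the file uses `\n` line endings, and the parser reports zero-based line and column positions for the start and end of each node (PyYAML nodes, for instance, carry `start_mark` and `end_mark` with `line` and `column`). The helper names `offset_of` and `splice_node` are hypothetical, and the span is hard-coded here where a real tool would read it from the node.

```python
def offset_of(text: str, line: int, column: int) -> int:
    """Convert a zero-based (line, column) pair into a character offset.

    Assumes "\n" line endings; each preceding line contributes its length
    plus one character for the line break.
    """
    lines = text.split("\n")
    return sum(len(l) + 1 for l in lines[:line]) + column


def splice_node(
    text: str,
    start: tuple[int, int],
    end: tuple[int, int],
    replacement: str,
) -> str:
    """Replace only the span [start, end) and keep every other character."""
    begin = offset_of(text, *start)
    stop = offset_of(text, *end)
    return text[:begin] + replacement + text[stop:]


original = (
    "# Server settings (edited by hand)\n"
    "server:\n"
    "  host: example.org   # public name\n"
    "\n"
    "  port: 8080\n"
)

# Assumed span of the scalar 8080: line 4, columns 8 to 12 (end exclusive),
# as a parser with position information would report it.
updated = splice_node(original, (4, 8), (4, 12), "9090")
print(updated)
```

Everything outside the span, including the comments and the blank line, is carried over unchanged. Replacing a whole nested structure would mean splicing in freshly serialized text, whose indentation comes from the serializer rather than from the original file.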
This will not be perfect; for example, it only works if the indentation step used in the input equals the indentation step used by the YAML implementation. Still, it might be good enough.
Of all suggestions, this is the most complex one and the most error-prone. Do not do this if you don't understand the implications.