I wrote a simple tool to generate a DBUnit XML dataset using queries that the user enters. I want to include each query entered in the XML as a comment, but the DBUnit API to generate the XML file doesn't support inserting the comment where I would like it (above the data it generates), so I am resorting to putting the comment with ALL queries either at the top or bottom.
So my question: is it valid XML to place it at either location? For example, above the XML Declaration:
<!-- Queries used: ... -->
<?xml version='1.0' encoding='UTF-8'?>
<dataset>
...
</dataset>
Or below the root node:
<?xml version='1.0' encoding='UTF-8'?>
<dataset>
...
</dataset>
<!-- Queries used: ... -->
I plan to try above the XML declaration first, but I doubt that it is valid XML, despite this claim from Wikipedia:
Comments can be placed anywhere in the tree, including in the text if the content of the element is text or #PCDATA.
I plan to post back if this works, but it would be nice to know whether the XML standard officially allows it.
UPDATE: See my response below for the result of my test.
According to the XML specification, a well-formed XML document is:
document ::= prolog element Misc*
where prolog is
prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
and Misc is
Misc ::= Comment | PI | S
and
XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
which means that the XML declaration, when present, must be the very first thing in the document: if you want a comment at the top, you cannot have an XML declaration.
You can, however, have comments after the declaration and outside the document element, either at the top or the bottom of the document, because Misc* can contain comments.
The specification agrees with Wikipedia on comments:
2.5 Comments
[Definition: Comments may appear anywhere in a document outside other markup; in addition, they may appear within the document type declaration at places allowed by the grammar. They are not part of the document's character data; an XML processor MAY, but need not, make it possible for an application to retrieve the text of comments. For compatibility, the string "--" (double-hyphen) MUST NOT occur within comments.] Parameter entity references MUST NOT be recognized within comments.
All of this together means that you can put comments anywhere that's not inside other markup, except that you cannot have an XML declaration if you lead with a comment.
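As a rough way to check these rules against a real parser, here is a minimal sketch using Python's standard-library xml.etree.ElementTree (my choice for illustration, not something DBUnit uses). The helper name is_well_formed and the sample documents are made up for this example, and other parsers may report the error differently:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_bytes):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Building blocks for the four placements discussed above (illustrative only).
DECL = b"<?xml version='1.0' encoding='UTF-8'?>\n"
COMMENT = b"<!-- Queries used: ... -->\n"
ROOT = b"<dataset></dataset>\n"

comment_before_decl = COMMENT + DECL + ROOT    # comment above the XML declaration
comment_after_decl = DECL + COMMENT + ROOT     # comment between declaration and root
comment_after_root = DECL + ROOT + COMMENT     # comment below the root element
comment_first_no_decl = COMMENT + ROOT         # leading comment, no declaration

for name, doc in [
    ("comment before declaration", comment_before_decl),
    ("comment after declaration", comment_after_decl),
    ("comment after root", comment_after_root),
    ("leading comment, no declaration", comment_first_no_decl),
]:
    print(name, "->", is_well_formed(doc))
```

Only the first placement should be rejected, which matches the grammar: XMLDecl can only appear at the very start of prolog, while Misc* permits comments everywhere else outside the root element.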
However, while in theory, theory agrees with practice, in practice it doesn't, so I'd be curious to see how your experiment works out.