Can an XML document start with anything other than a < character?
This was a random thought I had while trying to work out how to distinguish a string containing XML from one containing a path to an XML file.
I believe the answer is no, but I'm looking to be certain.
Only a < or a whitespace character can begin a well-formed XML document.
The W3C XML Recommendation includes an EBNF grammar that definitively defines an XML document:
[1]  document ::= prolog element Misc*
[22] prolog   ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl  ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[27] Misc     ::= Comment | PI | S
[3]  S        ::= (#x20 | #x9 | #xD | #xA)+
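From these rules it follows that an XML document may start with a whitespace character or with a < character that opens one of the following constructs: an XML declaration (XMLDecl), a comment, a processing instruction (PI), a document type declaration (doctypedecl), or the root element.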
An XML document may start with no other character.
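For the original problem of telling an XML string apart from a path, the rule above suggests a simple first-character check. The following is only a minimal sketch: the function name looks_like_xml is hypothetical, and the tolerance for a leading byte order mark is an added assumption about how the string was decoded, not something taken from the grammar quoted above.

```python
# XML whitespace per production [3]: space, tab, carriage return, line feed.
XML_WHITESPACE = " \t\r\n"


def looks_like_xml(text: str) -> bool:
    """Heuristic: True if the first non-whitespace character is '<'.

    This does not check well-formedness; it only applies the
    "whitespace or <" rule to the start of the string.
    """
    # Assumption (not from the quoted grammar): tolerate one leading
    # byte order mark that may survive decoding.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Skip only the four whitespace characters that XML recognizes.
    stripped = text.lstrip(XML_WHITESPACE)
    return stripped.startswith("<")


if __name__ == "__main__":
    print(looks_like_xml("<root/>"))                # True
    print(looks_like_xml(" \n<!-- note --><a/>"))   # True
    print(looks_like_xml("/tmp/data.xml"))          # False
```

This is a heuristic rather than a guarantee: some filesystems allow a file name to begin with <, which this check would misclassify, and a string that passes may still fail to parse. If certainty matters, it is probably safer to attempt a real parse and fall back to treating the string as a path when parsing fails.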
Notes:
<
and
cannot be whitespace.