I understand why control characters are illegal in XML 1.0, but I still need to store them somehow in an XML payload, and I cannot find any recommendations about escaping them. I cannot upgrade to XML 1.1.
How should I escape, e.g., the SOH character (\u0001, the standard separator for FIX messages)?
The following doesn't work:
<data></data>
One way is to use processing instructions: <?hex 01?>. But that only works in element content, not in attributes. And of course the processing instruction needs to be understood by the receiving application.
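To illustrate the receiving side, here is a minimal Python sketch that turns <?hex NN?> processing instructions back into characters. The function name decode_pi and the wrapper element name data are made up for this example, and it assumes the payload is flat text directly inside one element. It uses xml.dom.minidom because that parser keeps processing instructions in the tree.

```python
from xml.dom import Node, minidom


def decode_pi(xml_text: str) -> str:
    """Rebuild the original text from an element whose control
    characters were written as <?hex NN?> processing instructions.

    Hypothetical helper: assumes flat content (text and PIs only)
    directly under the document element.
    """
    doc = minidom.parseString(xml_text)
    parts = []
    for node in doc.documentElement.childNodes:
        if node.nodeType == Node.TEXT_NODE:
            parts.append(node.data)
        elif (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
              and node.target == "hex"):
            # The PI data is the hex code of the character, e.g. "01".
            parts.append(chr(int(node.data, 16)))
    return "".join(parts)


message = decode_pi("<data>8=FIX.4.2<?hex 01?>35=0<?hex 01?></data>")
```

Any application that does not know about the hex target will simply skip the instructions and see the fields run together, which is the caveat mentioned above.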
You could also use elements: <hex value="01"/>, but elements are visible in an XSD schema or DTD, while processing instructions are hidden.
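As a rough sketch of the element variant in Python: the function names encode and decode and the wrapper element data are invented for illustration, and the regular expression only covers the C0 control characters that XML 1.0 forbids (everything below U+0020 except tab, LF and CR), not the other excluded code points.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 does not allow (tab, LF, CR are fine).
_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def encode(text: str, tag: str = "data") -> str:
    """Write text into <tag>, replacing each illegal control
    character with a <hex value="NN"/> child element."""
    root = ET.Element(tag)
    last = None  # most recently added <hex> element, if any
    pos = 0
    for m in _ILLEGAL.finditer(text):
        chunk = text[pos:m.start()]
        if last is None:
            root.text = chunk   # text before the first <hex>
        else:
            last.tail = chunk   # text following the previous <hex>
        last = ET.SubElement(root, "hex", {"value": f"{ord(m.group()):02X}"})
        pos = m.end()
    rest = text[pos:]
    if last is None:
        root.text = rest
    else:
        last.tail = rest
    return ET.tostring(root, encoding="unicode")


def decode(xml_text: str) -> str:
    """Inverse of encode(): expand <hex> children back into characters."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "hex":
            parts.append(chr(int(child.get("value"), 16)))
        parts.append(child.tail or "")
    return "".join(parts)


fix = "8=FIX.4.2\x0135=0\x01"
xml_text = encode(fix)
```

Because the placeholder is a real element, a schema for the surrounding document would have to declare the content as mixed and allow hex children.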
Another approach: if a piece of payload can contain such characters, encode the whole payload in Base64.
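A minimal Python sketch of the Base64 route, using only the standard library. The element name data, the encoding attribute, and the function names wrap_payload and unwrap_payload are assumptions made for the example, not any kind of standard.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_payload(raw: bytes) -> str:
    """Return an XML element carrying raw as Base64 text."""
    elem = ET.Element("data", {"encoding": "base64"})  # attribute is illustrative
    elem.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def unwrap_payload(xml_text: str) -> bytes:
    """Inverse of wrap_payload(): parse the element and decode its text."""
    elem = ET.fromstring(xml_text)
    return base64.b64decode(elem.text or "")


fix_message = b"8=FIX.4.2\x019=5\x0135=0\x01"
wrapped = wrap_payload(fix_message)
restored = unwrap_payload(wrapped)
```

The trade-off is that the payload is no longer human-readable in the XML and grows by roughly a third, but it also works in attributes and needs no special handling by the XML parser.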