Encoding XML using ASCII encoding instead of character entities


Alright, so here is my issue. I need to generate XML in Java to pass on to another application. I started off thinking this would be easy using an org.w3c.dom.Document. Unfortunately, the application I need to pass the XML to requires that special characters like " be encoded as ASCII (&#034;) instead of their character entity (&quot;). Does anybody know a simple solution to this?

P.S. Changing the target application is not an option.

Update: So let's say my app is given the following string as input:

he will "x" this if needed

My app needs to output this:

<field value="he will &#034;x&#034; this if needed"/>

The XML generator I am using (and, I am guessing, most others) outputs this instead, which my target does not accept:

<field value="he will &quot;x&quot; this if needed"/>

I realize my target may not quite be up to XML standards, but that doesn't help me as I have no control over it. This is my situation and I have to deal with it. Any ideas other than simply converting every special character by hand?


Solution

  • I wonder how you serialize the XML--to a string, a stream, etc. You can post-process your output to replace general entity references with their numeric equivalents, e.g.,

    sed 's/&lt;/\&#60;/g; s/&gt;/\&#62;/g; s/&amp;/\&#38;/g; s/&apos;/\&#39;/g; s/&quot;/\&#34;/g'

    or

    xmlResultString = xmlResultString.replace("&lt;", "&#60;"); // Strings are immutable, so keep the result; etc. for other entities

    There are exactly 5 pre-defined general entities in XML (http://www.w3.org/TR/REC-xml/#sec-predefined-ent) and you can safely perform this as a textual replacement. There is no danger that it modifies anything except the references (well, maybe in comments, PIs, and CDATA sections, but it doesn't sound like your scenario uses them, or that the target even accepts them).

    I agree with Mark that your target application is not a conforming XML processor. At least it comes with documentation that states explicitly where it diverges from XML. I believe the Recommendation (link above) disagrees with Christopher's comment, though it's irrelevant to OP's question as his target declares its non-conformance to the Recommendation.

    Ari.
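
To make the post-processing idea above concrete, here is a minimal Java sketch. It assumes the document is serialized to a String with the standard javax.xml.transform API before being handed to the target; the class name NumericEntities and the helpers toNumericReferences and serialize are made up for illustration. Note that the question's sample uses a zero-padded form (&#034;), so if the target really insists on that, the replacement strings would need adjusting.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NumericEntities {

    // Hypothetical helper: rewrites the five predefined entity references
    // as decimal character references. String.replace is a literal
    // (non-regex) replacement and returns a new String each time.
    static String toNumericReferences(String xml) {
        return xml.replace("&lt;", "&#60;")
                  .replace("&gt;", "&#62;")
                  .replace("&amp;", "&#38;")
                  .replace("&apos;", "&#39;")
                  .replace("&quot;", "&#34;");
    }

    // Hypothetical helper: serializes a DOM document to a String,
    // without the XML declaration.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Build the <field> element from the question.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element field = doc.createElement("field");
        field.setAttribute("value", "he will \"x\" this if needed");
        doc.appendChild(field);

        // Serialize first, then post-process the text.
        System.out.println(toNumericReferences(serialize(doc)));
    }
}
```

The replacement runs on the serialized text rather than on the DOM, because the serializer is what introduces the named references in the first place. The order of the replacements does not matter here, since none of the numeric forms contains one of the named references.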