java xml escaping stax

How do I write unescaped XML outside of a CDATA section?


I am trying to write XML data using StAX, where the content itself is HTML.

If I try

xtw.writeStartElement("contents");
xtw.writeCharacters("<b>here</b>");
xtw.writeEndElement();

I get this

<contents>&lt;b&gt;here&lt;/b&gt;</contents>

Then I noticed the CDATA method and changed my code to:

xtw.writeStartElement("contents");
xtw.writeCData("<b>here</b>");
xtw.writeEndElement();

and this time the result is

<contents><![CDATA[<b>here</b>]]></contents>

which is still not what I need. What I really want is:

<contents><b>here</b></contents>

So is there an XML API/library that allows me to write raw text without putting it in a CDATA section? So far I have looked at StAX and JDOM, and neither seems to offer this.

In the end I might resort to a good old StringBuilder, but that would not be elegant.

Update:

I mostly agree with the answers so far. However, instead of <b>here</b> I could have a 1MB HTML document that I want to embed in a larger XML document. What you suggest means that I would have to parse this HTML document in order to understand its structure. I would like to avoid that if possible.

Answer:

It is not possible through the writer API itself; otherwise you could create XML documents that are not well-formed.
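
For the case in the update, where parsing the HTML is to be avoided, one workaround that is sometimes used is to flush the XMLStreamWriter and write the fragment straight to the underlying java.io.Writer. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes that writing an empty text node makes the implementation close the pending start tag, which the StAX API does not guarantee.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class; none of these names come from the question.
final class RawFragmentDemo {

    // Wraps a trusted, already well-formed fragment in <contents> without parsing it.
    static String wrapRaw(String trustedFragment) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xtw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xtw.writeStartElement("contents");
            // Assumption: an empty text node forces the writer to emit the
            // pending '>' of <contents>. This is implementation-dependent.
            xtw.writeCharacters("");
            xtw.flush();
            // Bypass the XML writer: nothing is escaped or checked here.
            out.write(trustedFragment);
            xtw.writeEndElement();
            xtw.flush();
            xtw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrapRaw("<b>here</b>"));
    }
}
```

This carries exactly the risk described above: if the fragment is not well-formed XML (an unclosed tag, or an HTML entity such as &nbsp; that XML does not predefine), the resulting document is broken and the writer will not notice.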


Solution

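  • The issue is that <b>here</b> is not raw text; it is an element containing text, so you should write it as one:

    xtw.writeStartElement("contents");
    xtw.writeStartElement("b");
    // Plain text content; writeCData here would emit <![CDATA[here]]> instead.
    xtw.writeCharacters("here");
    xtw.writeEndElement(); // closes </b>
    xtw.writeEndElement(); // closes </contents>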
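
A self-contained version of this approach might look like the following. It is a minimal sketch: the class name and the use of a StringWriter are assumptions made for illustration, and the inner text is written with writeCharacters so that the output matches the form asked for in the question.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class so the calls above can be run as-is.
final class NestedElementDemo {

    // Builds <contents><b>here</b></contents> by writing <b> as a real element.
    static String build() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xtw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xtw.writeStartElement("contents");
            xtw.writeStartElement("b");
            xtw.writeCharacters("here"); // text only, no markup to escape
            xtw.writeEndElement();       // </b>
            xtw.writeEndElement();       // </contents>
            xtw.flush();
            xtw.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints <contents><b>here</b></contents>
    }
}
```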