Using StAX, I was surprised to find that an XML block such as:
<badger>
<![CDATA[Text about a badger]]>
</badger>
is treated as if it were:
START_ELEMENT (badger)
CHARACTERS ( Text about a badger )
END_ELEMENT (badger)
That is, the CDATA section and the surrounding whitespace are flattened into a single CHARACTERS event. No CDATA event is reported.
Is this correct behaviour? How can I separate the whitespace from the CDATA?
I am using the Woodstox implementation.
I don't know the Woodstox implementation, but could this bug, resolved in 2006, still be a factor? Are you setting the optional report-cdata-event property?
(See also this message about a similar problem.)
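In case it helps, here is a minimal sketch of how I would check what the parser reports. It uses only the standard javax.xml.stream API. The class name CdataEvents and the method describe are made up for illustration, and the full property URI is my assumption about what "report-cdata-event" expands to; the code only sets it if the factory says it supports it. It also sets IS_COALESCING explicitly, because a coalescing reader merges CDATA into the adjacent text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of StAX or Woodstox.
public class CdataEvents {
    // Assumed full name of the optional property; implementation-specific.
    static final String REPORT_CDATA =
        "http://java.sun.com/xml/stream/properties/report-cdata-event";

    // Returns one entry per element or text event, e.g. "CHARACTERS(...)".
    static List<String> describe(String xml, boolean coalescing)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Coalescing merges adjacent character data, CDATA included.
        factory.setProperty(XMLInputFactory.IS_COALESCING, coalescing);
        // Setting an unsupported property throws, so ask first.
        if (factory.isPropertySupported(REPORT_CDATA)) {
            factory.setProperty(REPORT_CDATA, Boolean.TRUE);
        }

        XMLStreamReader reader =
            factory.createXMLStreamReader(new StringReader(xml));
        List<String> events = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        events.add("START_ELEMENT(" + reader.getLocalName() + ")");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        events.add("END_ELEMENT(" + reader.getLocalName() + ")");
                        break;
                    case XMLStreamConstants.CDATA:
                        events.add("CDATA(" + reader.getText() + ")");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.SPACE:
                        events.add("CHARACTERS(" + reader.getText() + ")");
                        break;
                    default:
                        // Ignore comments, processing instructions, END_DOCUMENT, etc.
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return events;
    }
}
```

Calling describe(xml, false) and printing the list should show whether your parser reports the whitespace and the CDATA section as separate events. If they still come back as a single CHARACTERS event with coalescing off, that would point at the implementation rather than your code.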