xml special-characters sax

Avoid converting XML special characters with SAX parser


I'm using a SAX parser (ContentHandler) to parse some XML.

Is it possible to preserve special XML characters?

For example, instead of parsing &amp;amp; as &amp;, is it possible to keep the &amp;amp;?


Solution

  • I don't think you can do that directly. Note, though, that org.xml.sax.EntityResolver lets you customize the resolution of external entities (not &amp;amp; and the like), and org.xml.sax.ext.LexicalHandler reports the start and end of an entity, if your SAX implementation supports it. That might help in locating the entities.

    You could also reintroduce all entity references as suggested by @MichaelKay
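
As a rough illustration of the "reintroduce the entity references" idea, here is a minimal sketch that re-escapes the markup-significant characters inside the characters() callback. The class name ReEscapingHandler and the helper parseKeepingEntities are made up for this example and are not part of any SAX API. It assumes you only care about &, < and > in text content. It cannot tell whether the source originally contained &amp;amp;, a numeric reference such as &amp;#38;, or a CDATA section, so all of them come out as &amp;amp;.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects text content and escapes it again.
public class ReEscapingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser has already resolved entity references here,
        // so we put the escaped forms back character by character.
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '&': text.append("&amp;"); break;
                case '<': text.append("&lt;"); break;
                case '>': text.append("&gt;"); break;
                default:  text.append(ch[i]);
            }
        }
    }

    public String getText() {
        return text.toString();
    }

    // Hypothetical convenience method: parse a string and return the re-escaped text.
    public static String parseKeepingEntities(String xml) throws Exception {
        ReEscapingHandler handler = new ReEscapingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseKeepingEntities("<a>Tom &amp; Jerry</a>"));
    }
}
```

Because the parser may deliver one text node in several characters() calls, the sketch appends to a StringBuilder rather than assuming a single call per node. Quotes are deliberately left alone; add cases for them if you need &amp;quot; and &amp;apos; back as well.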