TinyXML and preserving HTML Entities


I'm using TinyXML to parse some XML that has HTML entities embedded in text nodes. I realize that TinyXML is just an XML parser, so I don't expect or even want it to do anything to the entities. In fact, I want it to leave them alone.

If I have some XML like this:

...
<blah>&uuml;</blah>
...

When I call Value() on the TiXmlText instance, I get:

"uuml;"

So TinyXML always seems to remove the ampersand. Is there any way to get it to leave the entity alone so it comes out unchanged?

I'd appreciate any ideas.


Solution

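  • If you look at the TinyXML documentation, you'll see that it recognizes only the five predefined XML character entities (&amp;, &lt;, &gt;, &quot; and &apos;), so &uuml; is not one of them. It also accepts numeric character references for Unicode code points, such as &#xA0; or &#160;.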
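
One possible workaround, given the point above, is to escape the ampersand of any entity TinyXML does not know before handing the text to the parser, so that &uuml; becomes &amp;uuml;. The sketch below is only an illustration: escapeUnknownEntities and isRecognizedReference are hypothetical helper names, not part of TinyXML, and the code assumes the input contains no CDATA sections or comments where a bare & should stay untouched.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of TinyXML). `amp` must index an '&' in `s`.
// Returns true if the text there starts one of the five predefined XML
// entities, or looks like a numeric reference (anything starting with "&#").
static bool isRecognizedReference(const std::string& s, std::size_t amp) {
    static const char* const known[] = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"};
    for (const char* k : known) {
        if (s.compare(amp, std::strlen(k), k) == 0) {
            return true;
        }
    }
    return s.compare(amp, 2, "&#") == 0;
}

// Hypothetical helper (not part of TinyXML). Rewrites every '&' that does not
// start a recognized reference as "&amp;", leaving everything else unchanged.
// Assumption: the input has no CDATA sections or comments containing '&'.
std::string escapeUnknownEntities(const std::string& xml) {
    std::string out;
    out.reserve(xml.size());
    for (std::size_t i = 0; i < xml.size(); ++i) {
        if (xml[i] == '&' && !isRecognizedReference(xml, i)) {
            out += "&amp;";
        } else {
            out += xml[i];
        }
    }
    return out;
}
```

You would run the raw XML through this function and pass the result to the parser. Since &amp; is one of the entities TinyXML decodes, Value() on the text node should then give back "&uuml;" with the ampersand intact. Note that if you write the document back out, the text will be serialized in its escaped form.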