
XercesC setting output to UTF-8


I'm using the Xerces-C library to serialize my data. How can I set the output encoding to UTF-8? The XML is always generated as UTF-16, and I can't find a way to change that.

xercesc::DOMImplementation *gRegistry = xercesc::DOMImplementationRegistry::getDOMImplementation(X("Core"));
xercesc::DOMDocument *doc = gRegistry->createDocument(
        0,                      // root element namespace URI.
        X(oDocumentName.c_str()),       // root element name
        0);                 // document type object (DTD).
doc->setXmlStandalone(true);
// ... prepare the document ...
serializer = ((xercesc::DOMImplementationLS *)gRegistry)->createLSSerializer();
serializer->setNewLine(xercesc::XMLString::transcode("\n"));

XMLCh *xmlresult = serializer->writeToString(doc);
char *temp = xercesc::XMLString::transcode(xmlresult);
std::string result(temp);

xercesc::XMLString::release(&temp);
xercesc::XMLString::release(&xmlresult);
doc->release();
serializer->release();
getStream() << result.c_str();

When I deserialize with JAXB on the Java side, I always get a "Content is not allowed in prolog" error, and so far the encoding is the only difference I can see in the XML. Deserializing locally in JAXB works; with the XML from Xerces-C I get this error. Formatting it in Notepad++ with the XML plugin also reports an error, but without any details.


Solution

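  • Check the usage of DOMLSOutput; that should give you exactly what you want. That is, create a DOMLSOutput object, set its encoding to UTF-8, and write the document to it with DOMLSSerializer::write instead of using DOMLSSerializer::writeToString. writeToString returns an XMLCh string, which is always UTF-16, so the XML declaration still says UTF-16 even after you transcode the result to char*.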