c++ xml boost unicode boost-propertytree

Boost Property Tree: write_xml adds a Unicode 0x0 character to a child element in the XML file


I am using Boost's write_xml function to create XML. The XML is generated successfully, but an extra Unicode 0x0 character is added at the end of a child element.

code snippet:

boost::property_tree::write_xml(oss, pt, boost::property_tree::xml_writer_make_settings<std::string>(' ', 4));

I am sending this XML to a Java application, and Java throws the exception below while parsing the Boost-generated XML.

An Invalid XML character(Unicode: 0x0) was found in the element content of the document error

Does anyone know how to remove the Unicode 0x0 character from the XML when creating it with Boost Property Tree?


Solution

  • Your data has embedded NUL bytes. One way to reproduce this:

    std::string const hazard("erm\0", 4); 
    boost::property_tree::ptree pt;
    pt.put("a.b.c.<xmlattr>.d", hazard);
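
    To check whether a value carries such a byte before it goes into the tree, a small test along these lines may help. This is only a sketch: contains_nul is a hypothetical helper, not part of Boost.

    ```cpp
    #include <string>

    // Hypothetical helper (not part of Boost.PropertyTree):
    // returns true if s holds at least one NUL byte anywhere in its contents.
    bool contains_nul(std::string const& s) {
        return s.find('\0') != std::string::npos;
    }
    ```

    Note that std::string("erm\0", 4) keeps the NUL because the length is given explicitly, whereas std::string("erm\0") stops at the first NUL and holds only "erm".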
    

    UPDATE

    Upon closer inspection, NUL bytes are simply not supported in XML, full stop (see "Storing the value Null (ASCII) in XML").

    Either get rid of the offending byte, or use some kind of encoding, like base64.
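
    A minimal sketch of the first option, assuming the NUL bytes carry no meaning and can simply be dropped. strip_nuls is a hypothetical helper name, not a Boost function:

    ```cpp
    #include <algorithm>
    #include <string>

    // Hypothetical helper: returns a copy of s with every NUL byte removed.
    // Assumption: the NUL bytes are stray (e.g. a copied terminator), not data.
    std::string strip_nuls(std::string s) {
        s.erase(std::remove(s.begin(), s.end(), '\0'), s.end());
        return s;
    }
    ```

    Applied to the example above, that would be pt.put("a.b.c.<xmlattr>.d", strip_nuls(hazard)). If the NUL bytes are real data, this loses information, and an encoding such as base64 is the safer choice.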


    Old analysis and demonstration follows

    Mind you, Property Tree is not an XML library, and as such could have limitations that don't conform to the XML standard.

    I still think this is a BUG, since it doesn't roundtrip: Property Tree cannot read its own serialized property tree back:

    Live On Coliru

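    #include <boost/property_tree/xml_parser.hpp>
    #include <fstream>
    #include <iostream>
    #include <string>

    int main() {
        // Four characters: 'e', 'r', 'm' and an embedded NUL byte.
        std::string const hazard("erm\0", 4);

        {
            // Write a tree whose attribute value contains the NUL byte.
            std::ofstream ofs("NULbyte.xml");

            boost::property_tree::ptree pt;
            pt.put("a.b.c.<xmlattr>.d", hazard);

            write_xml(ofs, pt);
        }
        {
            // Try to read the file back and compare with the original value.
            std::ifstream ifs("NULbyte.xml");

            boost::property_tree::ptree pt;
            read_xml(ifs, pt);
            std::cout << (hazard == pt.get<std::string>("a.b.c.<xmlattr>.d")) << "\n";
        }
    }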
    

    If you prefer, the JSON backend handles this correctly: Live On Coliru