PugiXML C++ Newline handling issue: '\n\n' becomes '\\n\\n'


Recently I have been using XML files for a localisation system for a game I am developing (which uses Cocos2d-X). I am using PugiXML to parse the XML localisation file. I have run into an issue when parsing English strings with newline (\n) characters in them. Specifically, this is how such a string appears in the XML file:

[Image: the string as it appears in the XML file]

The problem is that PugiXML parses these \n\n characters into \\n\\n. Here's how it looks in the debugger:

[Image: the string as shown in the debugger]

Note that the string the system is trying to translate is

englishstring (englishstring std::__1::string "(We?[lcom]*e)(to|Regex).+\n\n(tap to continue)")

and the string that pugixml is returning is

engString (engString std::__1::string "(We?[lcom]*e)(to|Regex).+\\n\\n(tap to continue)")

Obviously comparing these two strings fails, so nothing gets translated. Plus, Cocos2d-X won't draw a newline with \\n so setting all linebreaks to that isn't an option either.

I understand this is probably desired behaviour, but my question is: how can I fix/disable this behaviour? I have set my XML parsing mode to pugi::parse_minimal but it still returns \\n characters.


Solution

  • In C++, a backslash in a string literal must be escaped with another backslash. The debugger shows strings in that same literal notation, so if the text contains a single \, it is displayed as "\\" even though the string holds just one backslash.

    The character following the backslash does not change this. XML has no backslash escapes, so \n in the XML file is just two characters, a backslash and the letter n. That is why \n is displayed as "\\n".

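    You seem to expect a real newline in the XML. In that case you must use the XML character reference &#10; to represent the newline. Note that pugixml only expands such references when pugi::parse_escapes is enabled, which is the case for pugi::parse_default but not for pugi::parse_minimal. So instead of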

    <string16 englishstring="(We?[lcom]*e)(to|Regex).+\n\n(tap to continue)">
    

    use this instead

    <string16 englishstring="(We?[lcom]*e)(to|Regex).+&#10;&#10;(tap to continue)">
    

    Or you could put a verbatim newline in the xml. That will look like this:

    <string16 englishstring="(We?[lcom]*e)(to|Regex).+
    
    (tap to continue)">
    

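    But this is not reliable inside an attribute: XML attribute-value normalization replaces a literal newline with a space, and pugixml does the same when pugi::parse_wconv_attribute is enabled, as it is in pugi::parse_default.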
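
To make the point about the two characters concrete, here is a minimal sketch using only the standard library. It uses a short stand-in string instead of the real one. The names unescapeNewlines and checkExample are hypothetical and not part of pugixml or Cocos2d-X. The helper is only a possible fallback, assuming you cannot change the XML file and would rather convert the backslash + n pairs after parsing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of pugixml or Cocos2d-X): replaces each
// two-character sequence backslash + 'n' with one real newline character.
std::string unescapeNewlines(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 1 < in.size() && in[i + 1] == 'n') {
            out += '\n';
            ++i;  // skip the 'n' that belonged to the pair
        } else {
            out += in[i];
        }
    }
    return out;
}

// Checks the claims from the answer on a short stand-in string.
void checkExample() {
    // Attribute text a\nb as the parser returns it: 'a', backslash, 'n', 'b'.
    const std::string fromXml = "a\\nb";
    // The C++ literal "a\nb": 'a', newline, 'b'.
    const std::string fromCode = "a\nb";

    assert(fromXml.size() == 4);
    assert(fromCode.size() == 3);
    assert(fromXml != fromCode);                    // why the lookup fails
    assert(unescapeNewlines(fromXml) == fromCode);  // equal after conversion
}
```

This sketch treats every backslash followed by n as a newline, so it would also rewrite text that is meant to contain a literal backslash followed by n. Using &#10; in the XML avoids that ambiguity.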