How do I escape characters using the Node xml2js module?


I have an XML document with an & in it, so I am getting this error:

   [Error: Invalid character in entity name
   Line: 155
   Column: 63
   Char:  ]

I wrote a function to escape illegal XML characters:

const escapeIllegalCharacters = (xml) => {
  xml = xml
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
    .replace(/>/g, '&gt;')
    .replace(/</g, '&lt;');
  return xml;
};

And put it into a valueProcessor:

return parse.parseString(xml, {valueProcessors: [escapeIllegalCharacters]});

But I'm still getting the same error. Is this the wrong way to escape characters using the xml2js module?


Solution

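  • You need to escape the ampersands before calling parseString. Value processors only run on values the parser has already extracted, so they cannot repair input that the parser itself rejects.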

    You can use the regular expression from this answer to escape ampersands that are not themselves part of an escape sequence:

    return parse.parseString(
      xml.replace(/&(?!(?:apos|quot|[gl]t|amp);|#)/g, '&amp;')
    );
    

    Whether this solves your problem depends on the content of your XML. For example, this simplistic mechanism also escapes ampersands inside CDATA sections, which should be left unescaped.
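
    If your document does contain CDATA sections, one possible workaround is to split the input on them and apply the replacement only to the text in between. The following is a minimal sketch, not a full XML-aware solution: the helper name escapeBareAmpersands and the CDATA pattern are assumptions made for illustration, and it does not handle other places where a literal & is legal, such as comments or processing instructions.

```javascript
// Same pattern as above: an ampersand that does not start a known entity
// (&apos; &quot; &gt; &lt; &amp;) or a numeric reference (&#...).
const BARE_AMPERSAND = /&(?!(?:apos|quot|[gl]t|amp);|#)/g;

// The capturing group makes split() keep each CDATA section in the result.
const CDATA_SECTION = /(<!\[CDATA\[[\s\S]*?\]\]>)/;

// Hypothetical helper: escape bare ampersands everywhere except inside CDATA.
function escapeBareAmpersands(xml) {
  return xml
    .split(CDATA_SECTION)
    // After the split, odd indexes hold CDATA sections; leave those untouched.
    .map((chunk, i) => (i % 2 === 1 ? chunk : chunk.replace(BARE_AMPERSAND, '&amp;')))
    .join('');
}
```

    You would then pass the result to the parser in place of the raw string, for example parse.parseString(escapeBareAmpersands(xml)).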