php xml xmlreader

How to parse XML with unescaped ampersand


I have to read a large (about 200 MB) XML file, and I am using XMLReader with PHP. The URL node contains an unescaped ampersand, and parsing always stops at the first URL node. I'm using the windows-1250 encoding, the same one specified in the XML declaration of the file.

I am getting this error: parser error : EntityRef: expecting ';' in

Is it possible to parse XML with & in the value of a node?

Thank you for any tips. I can share the code if you need it.


Solution

  • Is it possible to parse XML with & in the value of a node?

    No. An unescaped & means the file is not well-formed XML, so strictly speaking it is not an XML file at all. A conforming XML parser has to reject it; a parser that accepted it would no longer be an XML parser.

    However, you can pre-process the data yourself before passing it to an XML parser and fix the issue there (& -> &amp;).
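
The pre-processing step could look roughly like the sketch below. It is only one possible approach, not something from the original answer: the function names `escapeBareAmpersands` and `fixXmlFile` are hypothetical, and it assumes that the file has line breaks (so that no single line is too large for memory), that every `&name;` sequence is a real entity reference, and that the file has no CDATA sections, where a literal & is legal. It works on raw bytes, which is safe for windows-1250 because `&` and `;` have the same byte values as in ASCII.

```php
<?php
// Hypothetical helper: escape every "&" that does not start an entity reference.
function escapeBareAmpersands(string $chunk): string
{
    // Leave &name;, &#123; and &#x1F; untouched; turn any other "&" into "&amp;".
    $fixed = preg_replace(
        '/&(?!(?:[A-Za-z_:][A-Za-z0-9_:.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/',
        '&amp;',
        $chunk
    );
    if ($fixed === null) {
        throw new RuntimeException('preg_replace failed');
    }
    return $fixed;
}

// Hypothetical helper: copy $source to $target line by line, fixing ampersands,
// so the whole 200 MB file never has to sit in memory at once.
function fixXmlFile(string $source, string $target): void
{
    $in = fopen($source, 'rb');
    $out = fopen($target, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open input or output file');
    }
    while (($line = fgets($in)) !== false) {
        fwrite($out, escapeBareAmpersands($line));
    }
    fclose($in);
    fclose($out);
}
```

After calling `fixXmlFile()` with the original path and a path for the repaired copy, the copy can be opened with `XMLReader::open()` and read as usual. Writing to a second file keeps the original untouched, which helps if the regular expression turns out to be too aggressive for your data.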