Is there an online LIBXML2 XML parser available or a way to parse XML with libxml2 standalone?

We are working on a module that parses XML with the libxml2 component, and we have hit an issue when the XML contains a namespace with a non-ASCII character such as é.

Sample XML file:

<?xml version="1.0" encoding="UTF-8"?>
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Body>
    <Helloé xmlns="http://schemas/Helloé">
      <ns0:Helloé xmlns:ns0="http://schemas/Helloé" />
    </Helloé>
  </SOAP:Body>
</SOAP:Envelope>

We confirmed that the DOM parser supports this by running the document through a small test program. However, when we check the same document with the W3Schools online XML parser, it reports an error.

We also tried other online tools, and they report the same error message.

Can anyone tell us whether there is an online tool or resource we can use to confirm that this behavior comes from libxml2?

Or a sample program that can test this?

Solution

Simply run the file through libxml2's xmllint on the command line:

$ xmllint --noout so.xml
so.xml:4: namespace error : xmlns: 'http://schemas/Helloé' is not a valid URI
    <Helloé xmlns="http://schemas/Helloé">
                                           ^
so.xml:5: namespace error : xmlns:ns0: 'http://schemas/Helloé' is not a valid URI
      <ns0:Helloé xmlns:ns0="http://schemas/Helloé" />
                                                     ^

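Note that both errors concern the namespace URI only; the element name Helloé itself is not flagged. Replacing é in the URI with its UTF-8 percent-escape makes the errors go away: change the URI to http://schemas/Hello%C3%A9.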
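
If you need to produce the percent-escaped form for other URIs, the following is a minimal sketch using only the Python standard library. The helper name percent_encode_uri is made up for illustration, and the choice of which characters to leave untouched (URI delimiters and already-escaped sequences) is an assumption, not something libxml2 prescribes:

```python
from urllib.parse import quote

def percent_encode_uri(uri: str) -> str:
    """Percent-escape non-ASCII characters in a URI as UTF-8 bytes.

    Hypothetical helper: URI delimiters and '%' are listed as safe so that
    the URI structure and any existing escapes are left unchanged.
    """
    return quote(uri, safe=":/?#[]@!$&'()*+,;=%", encoding="utf-8")

# é is U+00E9, which is the two bytes C3 A9 in UTF-8.
print(percent_encode_uri("http://schemas/Helloé"))
# prints: http://schemas/Hello%C3%A9
```

Keep in mind that XML namespace names are compared as plain strings, so http://schemas/Helloé and http://schemas/Hello%C3%A9 are two different namespaces. Whoever produces the document and whoever consumes it have to agree on the escaped form.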