java, xml, xsd-validation, stax, sjsxp

Trying to strip namespace from XML before validating using Streaming parser (SJSXP)


I'm trying to ignore the namespaces provided in the root element of an XML file, in order to validate against an external schema. Unfortunately, I cannot change some of the items, as this is a heavily intertwined legacy system.

I've read (here on SO) that I should be able to wrap the input XML in a filter, but it doesn't seem to work, and I feel like I'm missing something. When I run the validation, I get the following error message:

org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'MY_ROOT_ELEMENT'.

Here's the beginning of the XML file with the namespace info:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MY_ROOT_ELEMENT xmlns="http://www.mycompany.net/somename" 
  schemaVersion="2.0.0" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.mycompany.net/somename 
                      myschema.xsd">
    ...
</MY_ROOT_ELEMENT>

Here's the accompanying beginning of the schema where MY_ROOT_ELEMENT is defined:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  elementFormDefault="qualified" 
  attributeFormDefault="unqualified">
    <xs:element name="MY_ROOT_ELEMENT" type="MY_ROOT_ELEMENT"/>

The StreamReaderDelegate which is used to ignore the namespace:

private static final class NoNamespaceStreamReaderDelegate extends StreamReaderDelegate {
    NoNamespaceStreamReaderDelegate(XMLStreamReader reader) {
        super(reader);
    }

    @Override
    public NamespaceContext getNamespaceContext() {
        return super.getNamespaceContext();
    }

    @Override
    public int getNamespaceCount() {
        return 1;
    }

    @Override
    public String getNamespacePrefix(int index) {
        if (index == 0) {
            return "xsi";
        }

        throw new NullPointerException();
    }

    @Override
    public String getNamespaceURI() {
        return null;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if ("xsi".equals(prefix)) {
            return XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI;
        }
        return null;
    }

    @Override
    public String getNamespaceURI(int index) {
        if (index == 0) {
            return XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI;
        }
        return null;
    }
}

And lastly, how the validation is called:

// reads from the classpath
XMLStreamReader reader = createReaderFromSource();
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sf.newSchema(inputSource.getSchema());
Source readerSource = new StAXSource(new NoNamespaceStreamReaderDelegate(reader));

Validator validator = schema.newValidator();
validator.validate(readerSource);
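
The error itself can be reproduced in isolation, which helps confirm that the namespace is the culprit. The sketch below is a minimal illustration, not the original setup: it assumes a simplified schema with no targetNamespace in which MY_ROOT_ELEMENT is typed as xs:string, and the class name NamespaceMismatchDemo is hypothetical. It validates the same root element once with the default namespace declared and once without it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class NamespaceMismatchDemo {
    // Simplified stand-in for the real schema (assumption): no targetNamespace,
    // so MY_ROOT_ELEMENT is declared in no namespace.
    public static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' elementFormDefault='qualified'>"
      + "<xs:element name='MY_ROOT_ELEMENT' type='xs:string'/>"
      + "</xs:schema>";

    // Root element in the company namespace, as in the question's XML.
    public static final String NAMESPACED =
        "<MY_ROOT_ELEMENT xmlns='http://www.mycompany.net/somename'>x</MY_ROOT_ELEMENT>";

    // Same root element with no namespace at all.
    public static final String PLAIN = "<MY_ROOT_ELEMENT>x</MY_ROOT_ELEMENT>";

    /** Returns null when the document is valid, otherwise the validator's error message. */
    public static String validate(String xml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespaced: " + validate(NAMESPACED));
        System.out.println("plain:      " + validate(PLAIN));
    }
}
```

If the namespaced document fails with a cvc-elt.1 message while the plain one passes, the schema is only describing MY_ROOT_ELEMENT in no namespace, and the validator is being handed an element in http://www.mycompany.net/somename.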

Solution

  • Posting this for posterity's sake.

    The answer seems to be to set the IS_NAMESPACE_AWARE property on the XMLInputFactory to false, and then use the StreamReaderDelegate to make sure the xsi:schemaLocation is ignored. Many wrapper libraries expose this as part of their API, but with the default StAX implementation you have to set the property yourself.

    Once I declared the following, and implemented the Delegate, everything worked like a charm.

    XMLInputFactory factory = XMLInputFactory.newInstance();
    // Turn namespace processing off so the reader reports elements without a namespace URI
    factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.FALSE);
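
To see what the property changes, here is a small sketch that reads the root element with namespace awareness on and off and returns the namespace URI the reader reports. It is an illustration under assumptions: the class name NamespaceAwareDemo and the trimmed-down XML string are hypothetical, and the behaviour described is that of the StAX implementation bundled with the JDK; other implementations may differ.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class NamespaceAwareDemo {
    // Trimmed-down version of the question's root element (assumption).
    public static final String XML =
        "<MY_ROOT_ELEMENT xmlns='http://www.mycompany.net/somename' schemaVersion='2.0.0'/>";

    /** Returns the namespace URI the reader reports for the root element. */
    public static String rootNamespace(boolean namespaceAware) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, namespaceAware);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(XML));
        try {
            // Advance from START_DOCUMENT to the first START_ELEMENT (the root).
            while (reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip anything before the root element
            }
            return reader.getNamespaceURI();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespace-aware:   " + rootNamespace(true));
        System.out.println("namespace-unaware: " + rootNamespace(false));
    }
}
```

With namespace awareness on, the reader reports the company namespace for the root element; with it off, no namespace URI should be reported. Note that in the unaware mode the xmlns declarations and xsi:schemaLocation are likely to surface as ordinary attributes, which is presumably why the delegate is still needed to hide them from the validator.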