Tags: java, xml, xsd, sax, xerces

Setting ErrorHandler on XML Validator causes incorrect validation


I've run into what seems like some very strange behavior while using Java's XML Validator (which I believe uses the Apache Xerces implementation).

I'm trying to validate some XML documents against an XSD, and I want to log anything that causes a document to be invalid. I figured implementing my own ErrorHandler would allow me to do this. I quickly discovered that this caused XML documents to be erroneously validated (i.e. invalid XML was being identified as valid for my XSD).

I did some testing and found that simply setting the Validator's ErrorHandler to anything was causing this behavior, illustrated below.

validator.validate(invalidXmlSource); // XML correctly identified as INVALID

validator.setErrorHandler(new DefaultHandler());
validator.validate(invalidXmlSource); // XML incorrectly identified as VALID

I would presume that the Validator uses DefaultHandler when one isn't specified, so I don't understand why the behavior is changing.

What is going on here?

Edit

public void validate(File dir, String xsdPath) throws SAXException { // newSchema() can throw SAXException
    File schemaFile = new File(xsdPath);
    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = schemaFactory.newSchema(schemaFile);
    Validator validator = schema.newValidator();
    //validator.setErrorHandler(new DefaultHandler()); <-- this line causes incorrect validation
    for (File xmlFile: dir.listFiles()) {
        try {
            validator.validate(new StreamSource(xmlFile));
            System.out.println("File '" + xmlFile.getName() + "' is valid.");
        } catch (SAXException e) {
            System.out.println("File '" + xmlFile.getName() + "' is NOT valid.");
            System.out.println("Reason: " + e.getLocalizedMessage());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Solution

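  • DefaultHandler does nothing for error and warning; it only throws an exception for fatalError. Schema validation failures are reported through error, so with DefaultHandler installed they are silently ignored and validate() returns normally. Between the docs for Validator and DefaultHandler, you'll see that, for better or worse, you're getting exactly what you asked for. :)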

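    Edit: Probably the main thing to note in the Validator docs is that when no error handler is set (null), the Validator behaves as if its handler throws an exception for both error and fatalError and ignores warning. It does not fall back to DefaultHandler.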

    Edit2: Here's an outline of a possible error handler to do what you want:

    import org.xml.sax.ErrorHandler;
    import org.xml.sax.SAXParseException;
    
    public class LoggingErrorHandler implements ErrorHandler {
    
        private boolean isValid = true;
    
        public boolean isValid() {
            return this.isValid;
        }
    
        @Override
        public void warning(SAXParseException exc) {
            // log info
            // valid or not?
        }
    
        @Override
        public void error(SAXParseException exc) {
            // log info
            this.isValid = false;
        }
    
        @Override
        public void fatalError(SAXParseException exc) throws SAXParseException {
            // log info
            this.isValid = false;
            throw exc;
        }
    }
    

    And it could be used like so:

    LoggingErrorHandler errorHandler = new LoggingErrorHandler();
    validator.setErrorHandler(errorHandler);
    validator.validate(invalidXmlSource);
    if (!errorHandler.isValid()) {
        //...
    }
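
To see the three behaviors side by side, here is a minimal, self-contained sketch. The class names (HandlerComparison, CollectingHandler), the tiny inline schema, and the sample document are all made up for illustration and are not from the question. It validates the same invalid document with no handler, with DefaultHandler, and with a handler that records what it is told.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class HandlerComparison {

    // Hypothetical schema: <root> must contain one <item> of type xs:int.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:element name='item' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Well-formed XML, but "abc" is not a valid xs:int, so it is schema-invalid.
    static final String INVALID_XML = "<root><item>abc</item></root>";

    static Validator newValidator() throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        return schema.newValidator();
    }

    /** Returns true if validate() returned normally, false if it threw a SAXException. */
    static boolean completesWithoutException(ErrorHandler handler)
            throws IOException, SAXException {
        Validator validator = newValidator();
        if (handler != null) {
            validator.setErrorHandler(handler);
        }
        try {
            validator.validate(new StreamSource(new StringReader(INVALID_XML)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    /** Records every reported problem; only fatal errors abort validation. */
    static class CollectingHandler implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            problems.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            problems.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXParseException {
            problems.add("fatal: " + e.getMessage());
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // No handler set: schema errors are thrown, so validate() does not complete.
        System.out.println("null handler completes: " + completesWithoutException(null));

        // DefaultHandler: error() is a no-op, so validate() completes despite invalid input.
        System.out.println("DefaultHandler completes: "
                + completesWithoutException(new DefaultHandler()));

        // Collecting handler: validate() completes, but the problems are recorded.
        CollectingHandler collecting = new CollectingHandler();
        System.out.println("collecting handler completes: "
                + completesWithoutException(collecting));
        collecting.problems.forEach(System.out::println);
    }
}
```

The takeaway is that once you install your own handler, "validate() returned without throwing" no longer means "the document is valid". You have to ask the handler what it saw, which is what the isValid flag above is for.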