Tags: java, xml, testing, xmlunit-2

How to disable XMLUnit DTD validation?


I am trying to compare two XHTML documents using XMLUnit 2.2.0, but the comparison takes too long. I suspect the library is downloading the DTD files from the Internet.

How can I disable DTD validation? I am using the following test code:

import org.xmlunit.builder.DiffBuilder;
import org.xmlunit.builder.Input;
import org.xmlunit.diff.Diff;
import org.xmlunit.diff.Difference;

public class Main {
    public static void main(String[] args) {
        // The control and test documents differ only in the body text.
        Diff d = DiffBuilder.compare(
                Input.fromString(
                     "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \n"
                    +"     \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n"
                    +"<html xmlns=\"http://www.w3.org/1999/xhtml\">\n"
                    +"     <head></head>\n"
                    +"     <body>some content 1</body>\n"
                    +"</html>")).withTest(
                Input.fromString(
                     "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \n"
                    +"     \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n"
                    +"<html xmlns=\"http://www.w3.org/1999/xhtml\">\n"
                    +"     <head></head>\n"
                    +"     <body>some content 2</body>\n"
                    +"</html>")).ignoreWhitespace().build();
        // Print every difference that was found.
        if (d.hasDifferences()) {
            for (Difference dd : d.getDifferences()) {
                System.out.println(dd.toString());
            }
        }
    }
}

After reading the XMLUnit Javadoc for DiffBuilder.withDocumentBuilderFactory(), I thought I could do this by supplying a document builder factory configured like this:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

dbf.setValidating(false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

Diff d = DiffBuilder.compare(Input.fromString(...)).withTest(
     Input.fromString(...)).withDocumentBuilderFactory(dbf)
          .ignoreWhitespace().build();

It did not work. My code only runs fast when I remove the DOCTYPE declaration from the XHTML snippets.


Solution

  • withDocumentBuilderFactory is exactly what you want to use, but unfortunately ignoreWhitespace defeats it.

    Under the covers, DiffBuilder creates a WhitespaceStrippedSource, which creates a DOM Document without using the DocumentBuilderFactory you've configured. This is a bug. Do you want to create an issue for this?

    A workaround for XMLUnit 2.2.0 is to create the Documents yourself, something like:

    Document control = Convert.toDocument(Input.fromString(...).build(), dbf);
    Document test = ...
    Diff d = DiffBuilder.compare(Input.fromDocument(control))
                 .withTest(Input.fromDocument(test))
                 .ignoreWhitespace().build();
    

    Edit: The bug has been fixed in XMLUnit 2.2.1, so the code in the question should now work without any changes.
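
For anyone still on 2.2.0, the factory setup from the question can be checked on its own, without XMLUnit, by parsing one of the XHTML snippets directly. The following is only a minimal sketch: the class name OfflineDomParser and its two methods are hypothetical, it assumes the default JDK parser (which recognizes the Xerces feature URIs used in the question), and enabling namespace awareness is an assumption made here for the xmlns attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
final class OfflineDomParser {

    // Builds a non-validating factory that does not load the external DTD.
    static DocumentBuilderFactory newOfflineFactory() {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // assumption: wanted because of the xmlns attribute
        dbf.setValidating(false);
        try {
            // Xerces-specific features, as used in the question.
            dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
            dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Parser does not support the Xerces features", e);
        }
        return dbf;
    }

    // Parses the given XML string into a DOM Document using the offline factory.
    static Document parse(String xml) {
        try {
            return newOfflineFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

If this parses quickly, the factory is fine and the slowdown comes from ignoreWhitespace bypassing it. Documents produced this way can then be passed to Input.fromDocument as in the workaround above.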