java xml tomcat dtd saxparser

Handle private external DTD with dependencies in SAX parser


I'm trying to parse an XML file whose DOCTYPE specifies a private external DTD, like this:

<!DOCTYPE MY1 SYSTEM "my1.dtd">

To handle this DTD locally for validation, I set an EntityResolver on the XMLReader parser:

        //use local DTD
        parser.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
                if (systemId.contains(my1Dtd)) {
                    return new InputSource(MyClass.class.getResourceAsStream(MY1_DTD_RESOURCE_PATH));
                } else {
                    return null;
                }
            }
        });

The InputSource is returned correctly, but there is a problem with the DTD itself: it contains references to other DTDs, so I put all the DTDs in the same package. When the application is deployed on Tomcat, however, a FileNotFoundException is thrown: D:\apache-tomcat-6.0.29\bin\my2.dtd (The system cannot find the file specified).

My question is: how can this dependency be specified correctly? Should it be handled in the resolveEntity method, or did I make a mistake in the path? (my2.dtd is declared as <!ENTITY % MY2 SYSTEM "my2.dtd"> inside my1.dtd and is stored in the same package.)


Solution

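  • The parser should also invoke resolveEntity when it needs to load my2.dtd. Your resolver returns null for that system ID, so the parser falls back to opening the system identifier itself. A relative identifier with no base URI is most likely being resolved against the working directory, which would explain the path under Tomcat's bin directory.

    So you need to modify the resolver along these lines: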

    public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
        if (systemId.contains(my1Dtd)) {
            return new InputSource(MyClass.class.getResourceAsStream(MY1_DTD_RESOURCE_PATH));
        } else if (systemId.contains(my2Dtd)) {
            return new InputSource(MyClass.class.getResourceAsStream(MY2_DTD_RESOURCE_PATH));
        } else {
            return null;
        }
    }
    

    However, to avoid this kind of maintenance you should consider using a catalog-based resolver, such as the Apache resolver. It relies on OASIS entity resolution (XML Catalogs): you describe the mapping from identifiers to local files in a catalog written in XML, and the resolver reads it, so you don't need to modify your code each time you add a DTD or move one to another place. (The resolver package comes bundled with the Apache Xerces distribution, if that is the parser you are already using.)
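
    Short of adopting a catalog, the per-DTD branches can also be collapsed into a single lookup keyed by the file name at the end of the system ID. The following is only a minimal sketch of that idea, not the Apache resolver: the class name LocalDtdResolverSketch, the nested ByNameResolver, and the in-memory DTD texts are all hypothetical and exist only to keep the example self-contained. In the application from the question, the map lookup would presumably be replaced by MyClass.class.getResourceAsStream(...) on a resource path built from the file name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class LocalDtdResolverSketch {

    /** Hypothetical resolver: maps the trailing file name of a system ID to a known DTD. */
    static class ByNameResolver implements EntityResolver {
        private final Map<String, String> dtdsByName;
        final List<String> resolved = new ArrayList<>(); // names served locally, in order

        ByNameResolver(Map<String, String> dtdsByName) {
            this.dtdsByName = dtdsByName;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws SAXException, IOException {
            if (systemId == null) {
                return null;
            }
            // The parser usually passes an absolute URI; keep only the last path segment.
            String name = systemId.substring(systemId.lastIndexOf('/') + 1);
            String dtd = dtdsByName.get(name);
            if (dtd == null) {
                return null; // unknown entity: let the parser use its default behaviour
            }
            resolved.add(name);
            InputSource source = new InputSource(new StringReader(dtd));
            source.setSystemId(systemId); // keep an identifier for error messages
            return source;
        }
    }

    /** Parses a sample document whose DTD pulls in a second DTD; returns the names resolved. */
    static List<String> parseSample() throws Exception {
        // Assumption: stand-ins for my1.dtd and my2.dtd, held in memory for the sketch.
        Map<String, String> dtds = new HashMap<>();
        dtds.put("my1.dtd",
                "<!ENTITY % MY2 SYSTEM \"my2.dtd\">\n%MY2;\n<!ELEMENT MY1 (ITEM*)>");
        dtds.put("my2.dtd", "<!ELEMENT ITEM (#PCDATA)>");
        String xml = "<!DOCTYPE MY1 SYSTEM \"my1.dtd\"><MY1><ITEM>a</ITEM></MY1>";

        ByNameResolver resolver = new ByNameResolver(dtds);
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // surface validation errors instead of ignoring them
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return resolver.resolved;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Resolved locally: " + parseSample());
    }
}
```

    With this shape, adding a third DTD means adding a resource rather than another else-if branch. Matching on the bare file name is a deliberate simplification: it assumes no two DTDs share a name, which a catalog would not require.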