I'm using the StAX event-based APIs to modify an XML stream.
The stream represents an HTML document, complete with a DTD declaration. I would like to copy this DTD declaration into the output document (written using an XMLEventWriter).
When I ask the factory to disregard DTDs, it does not download the DTD, but it removes the whole statement and leaves only a "<!DOCUMENTTYPE" string. When I do not disregard them, the whole DTD is downloaded and included when the DTD event is output verbatim.
I don't want to spend the time downloading this DTD, but I do want to include the complete DTD declaration (entity resolution is already disabled, and I don't need it). How can I disable the fetching of external DTDs?
You should be able to implement a custom XMLResolver that redirects attempts to fetch external DTDs to a local resource (if your code parses only a specific doc type, this is often a class resource right in a JAR).
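    import javax.xml.stream.XMLResolver;
    import javax.xml.stream.XMLStreamException;

    class CustomResolver implements XMLResolver {
        @Override
        public Object resolveEntity(String publicID,
                                    String systemID,
                                    String baseURI,
                                    String namespace)
                throws XMLStreamException
        {
            if ("The public ID you expect".equals(publicID)) {
                // Serve the DTD from a class resource instead of fetching it remotely.
                return getClass().getResourceAsStream("doc.dtd");
            } else {
                // Returning null leaves resolution to the parser's default mechanism.
                return null;
            }
        }
    }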
Note that some documents include only the system ID, so you should fall back to checking that. The problem with the system identifier is that it is supposed to be a system-specific URL rather than a well-known, stable URI. In practice, though, it is often used as if it were a URI.
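The fallback to the system ID could look roughly like the sketch below. The class name `FallbackResolver`, the `matches` helper, and the two XHTML 1.0 Strict identifiers are only illustrative assumptions; substitute the identifiers your documents actually declare, and make sure a `doc.dtd` resource exists next to the class.

```java
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;

class FallbackResolver implements XMLResolver {
    // Example identifiers (XHTML 1.0 Strict); assumptions for illustration only.
    static final String EXPECTED_PUBLIC_ID = "-//W3C//DTD XHTML 1.0 Strict//EN";
    static final String EXPECTED_SYSTEM_ID =
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";

    /** True if the public ID matches or, failing that, the system ID does. */
    static boolean matches(String publicID, String systemID) {
        return EXPECTED_PUBLIC_ID.equals(publicID)
                || EXPECTED_SYSTEM_ID.equals(systemID);
    }

    @Override
    public Object resolveEntity(String publicID, String systemID,
                                String baseURI, String namespace)
            throws XMLStreamException {
        if (!matches(publicID, systemID)) {
            return null; // unknown entity: leave it to the parser's default handling
        }
        // Hypothetical resource name; yields null if doc.dtd is not on the classpath.
        return getClass().getResourceAsStream("doc.dtd");
    }
}
```

Checking the public ID first keeps the stable identifier as the primary key, with the system ID used only as a second chance.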
To register the resolver, see the XMLInputFactory.setXMLResolver method.
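Wiring a resolver into the event API might look like the following minimal sketch. The class `ResolverWiring` and its two helper methods are hypothetical names, and the exact point at which a given StAX implementation consults the resolver for the external DTD subset is something you should verify against the implementation you use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;

class ResolverWiring {
    /** Creates an input factory whose readers use the given resolver. */
    static XMLInputFactory newFactory(XMLResolver resolver) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setXMLResolver(resolver);
        return factory;
    }

    /** Copies every event from the input document to a string. */
    static String copy(XMLInputFactory factory, String xml)
            throws XMLStreamException {
        XMLEventReader reader =
                factory.createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer =
                XMLOutputFactory.newInstance().createXMLEventWriter(out);
        writer.add(reader); // forwards all remaining events, DTD event included
        writer.flush();
        writer.close();
        reader.close();
        return out.toString();
    }
}
```

You would call `ResolverWiring.newFactory(new CustomResolver())` once and reuse the factory for every document you process.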