I use a non-validating reader to display or process untrusted XML documents. I do not need support for internal entities, but I do want to be able to process the documents even if a DOCTYPE is present.
With the disallow-doctype-decl feature of SAX, I can make sure that parsing an XML document carries no risk of external entity resolution or billion laughs DoS expansion. This is also recommended by the OWASP XXE Prevention Cheat Sheet.
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setFeature("http://apache.org/xml/features/continue-after-fatal-error", true);
reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// or
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
Unfortunately, this aborts parsing when a DOCTYPE is present:
org.xml.sax.SAXParseException; systemId: file:... ; lineNumber: 2; columnNumber: 10;
DOCTYPE is disallowed when the
feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
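To reproduce the rejection without a file on disk, here is a minimal sketch that parses an in-memory document. It assumes the JDK's built-in Xerces-based parser. It obtains the reader through SAXParserFactory rather than XMLReaderFactory, and the class name DisallowDoctypeDemo and the helper rejects are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original code.
class DisallowDoctypeDemo {

    // Returns true if the parser aborts with a SAXParseException.
    static boolean rejects(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        // Assumes a Xerces-based implementation that knows this feature.
        spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // DefaultHandler rethrows fatal errors and ignores all content events.
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // A document with a DOCTYPE is rejected, one without parses fine.
        System.out.println(rejects("<!DOCTYPE r [<!ENTITY e 'x'>]><r>&e;</r>"));
        System.out.println(rejects("<r>plain</r>"));
    }
}
```

The helper only reports whether parsing was aborted. It does not inspect the message, so it works the same regardless of how the implementation words the error.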
If I ignore this fatal error, the parser happily resolves internal entities, as you can see here: https://gist.github.com/ecki/f84d53a58c48b13425a270439d4ed84a
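The alternative feature set (external entities and external DTD loading switched off) does not help with internal entities either. The following sketch illustrates this under the assumption that the JDK's built-in parser is used. InternalEntityDemo and textOf are hypothetical names, and the two-level entity is a deliberately tiny stand-in for a recursive expansion.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original code.
class InternalEntityDemo {

    // Parses the document and returns all character data that was reported.
    static String textOf(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        // Same three features as the "or" variant in the question.
        spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        spf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Entity b refers to entity a twice; both are declared internally.
        String xml = "<!DOCTYPE r [<!ENTITY a 'ha'><!ENTITY b '&a;&a;'>]><r>&b;</r>";
        System.out.println(textOf(xml)); // prints "haha": internal entities are expanded
    }
}
```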
Is there a combination of features that lets me read past the DOCTYPE declaration without evaluating it (especially avoiding recursive expansion)?
I would like to avoid defining my own Apache-specific security-manager property or a special resolver.
According to core-libs-dev, the XMLReaderFactory will be deprecated in Java 9, and the way to obtain an XMLReader will be to use a SAXParser.
In that case FSP (FEATURE_SECURE_PROCESSING) can be used, which establishes some resource limits as well as removes remote schema handlers for ACCESS_EXTERNAL_DTD:
SAXParserFactory spf = SAXParserFactory.newInstance();
// when FSP is activated explicitly, it will also restrict external entities
spf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
XMLReader reader = spf.newSAXParser().getXMLReader();
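As a rough check of what FSP does and does not cover, this sketch parses a small in-memory document with a reader obtained as above. It assumes the JDK's built-in parser and its default expansion limits. FspDemo and textOf are hypothetical names. The DOCTYPE is still read and the small internal entity is still expanded, so FSP limits expansion rather than skipping the declaration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original code.
class FspDemo {

    // Parses with FSP enabled and returns the reported character data.
    static String textOf(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Small nested internal entities stay well below the FSP limits.
        String xml = "<!DOCTYPE r [<!ENTITY a 'ha'><!ENTITY b '&a;&a;'>]><r>&b;</r>";
        System.out.println(textOf(xml)); // prints "haha"
    }
}
```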