I upgraded Zend Framework from 1.11 to 1.12.3. Since then my tests fail with a strange error that I cannot explain: my XML fetching and processing routines now throw this exception:
PHP Fatal error: Uncaught exception 'Zend_Dom_Exception' with message
'Invalid XML: Detected use of illegal DOCTYPE' in ....
In Zend Framework 1.11, library/Zend/Dom/Query.php:197 looked like this:
switch ($type) {
case self::DOC_XML:
$success = $domDoc->loadXML($document);
break;
....
In 1.12 the same block has an additional check that throws as soon as a DOCTYPE node is found:
switch ($type) {
case self::DOC_XML:
$success = $domDoc->loadXML($document);
foreach ($domDoc->childNodes as $child) {
if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
require_once 'Zend/Dom/Exception.php';
throw new Zend_Dom_Exception(
'Invalid XML: Detected use of illegal DOCTYPE'
);
}
}
break;
.....
If I understand this correctly, this routine now refuses to parse any XML document that contains a DOCTYPE. Here is a small example that fails on my machine every time:
require_once 'Zend/Dom/Query.php';
$f = '<?xml version="1.0" standalone="yes"?>' .
'<!DOCTYPE hallo [<!ELEMENT hallo (#PCDATA)>]>' .
'<hallo>Hallo Welt!</hallo>';
$dom = new Zend_Dom_Query($f);
$results = $dom->queryXpath('//hallo');
Can someone explain this to me? I tested with Zend Framework 1.12.3 on PHP 5.3.2 and 5.4.6.
OK, I had a short conversation with Matthew Weier O'Phinney about why DOCTYPEs are no longer accepted. The reason is the security patch described here: http://framework.zend.com/security/advisory/ZF2012-02
They disabled DOCTYPE support to prevent XXE (XML eXternal Entity injection) and XEE (XML Entity Expansion) attacks. His reply:
"I closed the report because it's something we cannot fix, due to security implications. It doesn't matter if it's valid XML -- XEE and XXE vectors utilize perfectly valid XML in order to exploit issues in the underlying XML parser. Because we cannot control what version of libxml is used in every PHP distribution on which ZF is deployed, we must be defensive in our code. Furthermore, the moment we add a switch to disable the XEE and XXE vector checks, folks will use that switch without understanding the reason behind them.
There are a number of tools you can use to pre-process XML -- including pandoc or the PCRE tools in PHP -- if you cannot control the source of the XML and still want to parse it with our tools."
I pointed out that this was already fixed in libxml2 itself in 2012, but he argued that they cannot know which version of libxml2 is in use on any particular installation.
So what are the solutions?
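One option, following the PCRE suggestion in the quote above, is to strip the DOCTYPE before handing the XML to Zend_Dom_Query. This is only a minimal sketch under some assumptions: the document does not reference any entity declared in the DOCTYPE (otherwise parsing will fail once the declaration is gone), and the internal subset does not contain a literal `]>` inside a comment or quoted string. `stripDoctype` is a hypothetical helper of my own, not part of Zend Framework, and the regex is deliberately naive.

```php
<?php
// Hypothetical helper (NOT part of Zend Framework): removes the first
// DOCTYPE declaration, including an optional internal subset in [...].
// Naive on purpose: it does not handle "]>" inside comments or quoted
// strings within the internal subset.
function stripDoctype($xml)
{
    return preg_replace(
        '/<!DOCTYPE\s+[^\[>]*(?:\[.*?\])?\s*>/is',
        '',
        $xml,
        1 // a well-formed document has at most one DOCTYPE
    );
}

// Same sample document as in the question.
$f = '<?xml version="1.0" standalone="yes"?>' .
     '<!DOCTYPE hallo [<!ELEMENT hallo (#PCDATA)>]>' .
     '<hallo>Hallo Welt!</hallo>';

echo stripDoctype($f), "\n";
// prints: <?xml version="1.0" standalone="yes"?><hallo>Hallo Welt!</hallo>
```

With that in place, `new Zend_Dom_Query(stripDoctype($f))` should no longer trip the DOCTYPE check, so `queryXpath('//hallo')` can run as before. Keep in mind that the check exists for security reasons, so the stripped XML should still be treated as untrusted input.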
Thank you Rolando Isidoro for the help :)