PHP libxml XSD validation namespace error: Did not expect X, expected X


I've been trying to get libxml to validate an xmldsig <Signature> element via DOMDocument, yet it keeps producing spurious errors no matter how I hand the element to it. Having <Signature> as the document root apparently doesn't work with DOMDocument: I can't manage to set the xmlns attribute on the root properly, and trying to validate results in:

Element 'Signature': No matching global declaration available for the validation root. 

I think this may be because the schema is meant to be imported into another schema: you would want to sign 'something' after all, not have a signature by itself. So I created this wrapper schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    version="1.0"
    elementFormDefault="qualified">
<xs:import namespace="http://www.w3.org/2000/09/xmldsig#" schemaLocation="xmldsig-core-schema.xsd" />
<xs:element name="root">
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="ds:Signature" />
        </xs:sequence>
    </xs:complexType>
</xs:element>
</xs:schema>

I can now have an included signature for a dummy element.

I realize PHP's XML processing is rather bad and none of the tools work all that well. However, once it actually hands things off to libxml properly, there should be no issue doing this. Other answers suggest adding XPath as well, but I'd like to keep things simple and use as few PHP libraries as possible.

After a couple hundred lines of finagling, I got it to wrap just the <Signature> part of the XML file in a plain test document, inside a <root> element.

I think something is corrupting the namespaces in the document. $myDocument->schemaValidate(); fails with a peculiar error:

Element 'Signature': This element is not expected. Expected is ( {http://www.w3.org/2000/09/xmldsig#}Signature ).

Yet calling $myDocument->saveXML(); right after the failure returns XML that looks like this (I've added the formatting; the real output is unformatted, though the whitespace should not be significant):

<?xml version="1.0"?>
<root>
    <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
         (... omitted crypto ...)
    </Signature>
</root>

From the reading I've done on it, the error should mean that the namespace on the <Signature> element is wrong. But here the xmlns attribute explicitly sets it to exactly what the error message expects. So, what's wrong here?

Addendum: the code that prepares the signature for verification (in an arbitrary document) looks something like this. This is the complicated version, where I add a 'root' element instead of validating against the xmldsig schema directly. As my answer below explains, that turned out to be unnecessary. It is still a good test, because in more complicated scenarios you may want to validate a signed document in a single call to libxml:

function verifySchema(\DOMDocument $doc) {
    libxml_use_internal_errors(true);
    $new = new \DOMDocument;
    $newRoot = $new->createElement("root");
    $root = $doc->documentElement;
    $signature = null; // stays null when no <Signature> child is found
    foreach ($root->childNodes as $node) {
        if ($node->nodeName == "Signature") {
            // createElement() creates the element in no namespace; the xmlns
            // attribute set below shows up in the serialized output.
            $signature = $new->createElement("Signature");
            $signature->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns', 'http://www.w3.org/2000/09/xmldsig#');
            foreach ($node->childNodes as $subnode) {
                $signature->appendChild($new->importNode($subnode, true));
            }
            $newRoot->appendChild($signature);
            break;
        }
    }
    $new->appendChild($newRoot);
    if (!$signature) {
        throw new \Exception("Document is not signed: No signature node found.");
    }
    if (!$new->schemaValidate("xmldsig-test.xsd")) {
        // Error handling code here.
        return false;
    } else {
        return true;
    }
}

Solution

  • PHP is the problem. Its DOM implementation is subtly buggy and doesn't handle XML namespaces properly in programmatically created documents.

    There are various ways of adding or editing DOMDocument elements, and some, if not most, of them can leave you with an element E that is in namespace X when you look at the XML output, but that the DOMDocument object considers to be in namespace Y. Even if you only use the provided functions and never touch the object's internals, it's possible to put the document in an inconsistent state. In some cases, when you need the namespaces in a specific format (because of canonicalization issues), this is even unavoidable.
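The mismatch described above can be made visible with a minimal sketch. It assumes the element is built the same way as in the question (createElement() followed by setAttributeNS() for xmlns); the variable names are only illustrative. It compares the namespace stored on the DOM node with the namespace a parser sees in the serialized output.

```php
<?php
$xmldsigNs = 'http://www.w3.org/2000/09/xmldsig#';

// Build <root><Signature xmlns="..."/></root> programmatically.
$doc = new DOMDocument();
$root = $doc->createElement('root');
$doc->appendChild($root);

$sig = $doc->createElement('Signature');
$sig->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns', $xmldsigNs);
$root->appendChild($sig);

// Namespace as stored on the in-memory node.
$inMemoryNs = $sig->namespaceURI;

// Namespace as seen by a parser reading the serialized document.
$xml = $doc->saveXML();
$reloaded = new DOMDocument();
$reloaded->loadXML($xml);
$parsedNs = $reloaded->documentElement->firstChild->namespaceURI;

echo $xml;
var_dump($inMemoryNs, $parsedNs);
```

If the two dumped values differ, validating the in-memory document and validating its serialized text will disagree in the same way as in the question.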

    See:

    https://bugs.php.net/bug.php?id=78352

    Replace the line of code

    $myDocument->schemaValidate();
    

    with

    $myDocument->loadXML($myDocument->saveXML());
    $myDocument->schemaValidate();
    

    and the validation error should be gone. Either way (using a custom XSD that imports the xmldsig schema, or using the xmldsig schema file directly) should work.
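The round trip can also be wrapped in a small helper. The following is a sketch under stated assumptions, not a definitive fix: the function name validateAfterReload() is hypothetical, and the toy schema with the urn:example:toy namespace merely stands in for the xmldsig schema so the example runs without external files. It uses schemaValidateSource() for the inline schema; with a schema file, schemaValidate() would take its place.

```php
<?php
// Hypothetical helper: re-parse the serialized document so every node's
// in-memory namespace matches the text, then validate the fresh copy.
function validateAfterReload(DOMDocument $doc, string $xsdSource): bool
{
    $fresh = new DOMDocument();
    if (!$fresh->loadXML($doc->saveXML())) {
        return false;
    }
    return $fresh->schemaValidateSource($xsdSource);
}

// Toy schema (assumption): one global element in a target namespace.
$toyXsd = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="urn:example:toy"
    elementFormDefault="qualified">
  <xs:element name="item" type="xs:string"/>
</xs:schema>
XSD;

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Built like the <Signature> element in the question.
$doc = new DOMDocument();
$item = $doc->createElement('item', 'x');
$item->setAttributeNS('http://www.w3.org/2000/xmlns/', 'xmlns', 'urn:example:toy');
$doc->appendChild($item);

$direct = $doc->schemaValidateSource($toyXsd);      // in-memory tree
$afterReload = validateAfterReload($doc, $toyXsd);  // re-parsed copy
libxml_clear_errors();

var_dump($direct, $afterReload);
```

Unlike the two-line replacement above, this helper validates a re-parsed copy and leaves the caller's document object untouched; that is a design choice of the sketch, not something the original fix requires.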

    While the report is treated as 'not a bug', it really is a problem, because DOMDocument won't output the exact syntax with an xmlns attribute if you use createElementNS(); it uses the prefix notation instead. (The two may be equivalent syntactically, but hardly anybody understands XML namespaces, so the people accepting your XML might want it exactly like the example, i.e. using xmlns rather than ds:Signature elements.)

    Using DOMElement::setAttributeNS($currentNS, 'xmlns', $subElementNS) instead does output a default namespace declaration, but it does not recursively put the programmatically created elements under that element into the new namespace.
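When the signature comes from a parsed document, as in the question, one possible way to avoid the mismatch is to import the whole <Signature> element rather than recreating it with createElement(). The sketch below assumes that is acceptable for the use case; wrapSignature() is a hypothetical name and the sample input is made up. A deep importNode() copies the element together with its namespace, so the in-memory namespace does not depend on an xmlns attribute added afterwards. The sketch makes no promise about whether the serialized output uses xmlns or a prefix.

```php
<?php
const XMLDSIG_NS = 'http://www.w3.org/2000/09/xmldsig#';

// Hypothetical helper: copy the first xmldsig <Signature> child of the
// source document's root into a new document under a plain <root>.
function wrapSignature(DOMDocument $source): DOMDocument
{
    $wrapped = new DOMDocument();
    $newRoot = $wrapped->createElement('root');
    $wrapped->appendChild($newRoot);

    foreach ($source->documentElement->childNodes as $node) {
        if ($node instanceof DOMElement
            && $node->localName === 'Signature'
            && $node->namespaceURI === XMLDSIG_NS) {
            // Deep import keeps the namespace of the element and its subtree.
            $newRoot->appendChild($wrapped->importNode($node, true));
            return $wrapped;
        }
    }
    throw new Exception('Document is not signed: No signature node found.');
}

// Made-up sample input with an empty signature body.
$source = new DOMDocument();
$source->loadXML(
    '<doc><data>payload</data>'
    . '<Signature xmlns="http://www.w3.org/2000/09/xmldsig#"><SignedInfo/></Signature>'
    . '</doc>'
);

$wrapped = wrapSignature($source);
$copy = $wrapped->documentElement->firstChild;
var_dump($copy->namespaceURI === XMLDSIG_NS);
```

The wrapped document can then be passed to schemaValidate() with the wrapper schema from the question.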