
How does W3C canonicalization work for document subsets?


I am unsure whether "xmllint --c14n" handles namespaces correctly. For the following input, my hand-written legacy implementation of W3C canonicalization pulls the namespace declaration xmlns:xsi down to the Document tag.

<?xml version="1.0" encoding="UTF-8"?>
<conxml xmlns="urn:conxml:xsd:container.nnn.002" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:conxml:xsd:container.nnn.002 container.nnn.002.xsd">
  <MsgPain001>
    <Document xmlns="urn:swift:xsd:$pain.001.002.02">
      <pain.001.001.02>
      </pain.001.001.02>
    </Document>
  </MsgPain001>
</conxml>

Actual result according to my legacy implementation:

...
    <Document xmlns="urn:swift:xsd:$pain.001.002.02" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...

But xmllint --c14n does not do this; it outputs:

...
    <Document xmlns="urn:swift:xsd:$pain.001.002.02">
...

Can someone explain who is right according to the spec and why?

For details, see: http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Example-DocSubsets


Solution

  • According to the spec (Section 4.6):

    Unnecessary namespace declarations are not made in the canonical form.

    The "http://www.w3.org/2001/XMLSchema-instance" namespace is not needed to represent the document subset: the xsi:schemaLocation attribute is not part of the subset, so the xmlns:xsi declaration is "unnecessary" and is omitted.
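
To check this kind of output without xmllint, here is a minimal sketch using Python's standard-library xml.etree.ElementTree.canonicalize. Note the assumptions: that function implements C14N 2.0, not the C14N 1.0 algorithm behind xmllint --c14n, and it canonicalizes the whole document rather than a subset. It therefore only illustrates that a declaration is not repeated on an element that does not need it; it is not a reference for the 1.0 document-subset rules. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Input from the question (XML declaration omitted; canonical XML drops it anyway).
SOURCE = (
    '<conxml xmlns="urn:conxml:xsd:container.nnn.002" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="urn:conxml:xsd:container.nnn.002 container.nnn.002.xsd">\n'
    '  <MsgPain001>\n'
    '    <Document xmlns="urn:swift:xsd:$pain.001.002.02">\n'
    '      <pain.001.001.02>\n'
    '      </pain.001.001.02>\n'
    '    </Document>\n'
    '  </MsgPain001>\n'
    '</conxml>'
)

# canonicalize() returns the canonical form as a string when no output
# file is given. Assumption: C14N 2.0 semantics (Python 3.8+), not C14N 1.0.
canonical = ET.canonicalize(SOURCE)

# Count how often the xsi prefix is declared, and show each start tag
# that carries a namespace declaration.
xsi_declarations = canonical.count("xmlns:xsi=")
print("xmlns:xsi declared", xsi_declarations, "time(s)")
for line in canonical.splitlines():
    if "xmlns" in line:
        print(line.strip())
```

With this input, the xsi declaration stays on conxml, where xsi:schemaLocation uses it, and the Document start tag carries only its own default namespace.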