Can an XML element have two namespaces, or is this a libxml2 bug?


I have the following valid XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  <aaa xmlns:de="http://www.dolby.com/dcinema/ws/smi/v11/SPL" atr="abc" xmlns:fe="http://somewhere">
   some text
   <de:bbb atr1="abb" atr2="baa" >aaa</de:bbb>
   <de:ccc>aaa</de:ccc>
   <fe:ddd>bbb</fe:ddd>
   some more text
  </aaa>

And the following C code:

#include <stdio.h>
#include <libxml/xmlreader.h>
#include <libxml/tree.h>

char xml_data[] = {
    "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
    "  <aaa xmlns:de=\"http://www.dolby.com/dcinema/ws/smi/v11/SPL\" "
    "       atr=\"abc\""
    "       xmlns:fe=\"http://somewhere\">\n"
    "   some text\n"
    "   <de:bbb atr1=\"abb\"  atr2=\"baa\" >aaa</de:bbb>\n"
    "   <de:ccc>aaa</de:ccc>\n"
    "   <fe:ddd>bbb</fe:ddd>\n"
    "   some more text\n"
    "  </aaa>"
};

void printns(xmlNsPtr ns, int deep, const char *marker)
{
    while (ns)
    {
        printf("%*c%s+%s\n", deep * 5 + 1, ' ', marker, ns->prefix);
        ns = ns->next;
    }
}

void printelem(xmlNodePtr ptr, int deep)
{
    printf("%*c%s\n", deep * 5, ' ', ptr->name);
    if (ptr->type == XML_ELEMENT_NODE) 
    {
        printns(ptr->nsDef, deep, "d");
        printns(ptr->ns,    deep, "u");
    }

    if (ptr->xmlChildrenNode) printelem(ptr->xmlChildrenNode, deep+1);

    if (ptr->next) printelem(ptr->next, deep);
}

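int main(void)
{
    LIBXML_TEST_VERSION
    xmlInitParser();

    xmlDocPtr doc;
    doc = xmlReadDoc(BAD_CAST xml_data, NULL, NULL, XML_PARSE_NOBLANKS);
    if (doc == NULL)
    {
        /* xmlReadDoc returns NULL when the document cannot be parsed */
        fprintf(stderr, "failed to parse document\n");
        return 1;
    }

    printelem(doc->xmlChildrenNode, 1);

    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}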

This produces the following output:

 aaa
  d+de
  d+fe
      text
      bbb
       u+de
       u+fe
           text
      text
      ccc
       u+de
       u+fe
           text
      text
      ddd
       u+fe
           text
      text

As you can see, libxml2 reports that bbb and ccc have TWO namespaces at once, while ddd has one namespace, as expected. Is this some rule of the XML standard that I am not aware of, or is this a libxml2 bug?


Solution

  • The name of an XML element can only be in a single namespace, so you shouldn't treat the ns member of struct _xmlNode as a linked list. It actually points to one entry in the nsDef list of the element that declares the namespace, which is usually an ancestor element. Use the next pointer only to iterate nsDef; following next from ns merely walks the remaining declarations of that declaring element. If you change the printf statement in printns to also show the address of the xmlNs struct

    printf("%*c%s+%s [%p]\n", deep * 5 + 1, ' ', marker, ns->prefix,
           (void*)ns);
    

    the output becomes

     aaa
      d+de [0x9e9aff0]
      d+fe [0x9e9b1a0]
          text
          bbb
           u+de [0x9e9aff0]  // same as first entry in nsDef of aaa
           u+fe [0x9e9b1a0]  // should be ignored
               text
          text
          ccc
           u+de [0x9e9aff0]  // same as first entry in nsDef of aaa
           u+fe [0x9e9b1a0]  // should be ignored
               text
          text
          ddd
           u+fe [0x9e9b1a0]  // same as second entry in nsDef of aaa
               text
          text
    

    Note that ns itself always points to the correct xmlNs of the element. Only the entries reached through ns->next are unrelated to it.