php xml xpath simplexml

Why does node() XPath node type test return duplicates when used with SimpleXMLElement?


I used this example XML document for practice, and I don't understand the node() node test in XPath. For example, if I write:

$catalog = new SimpleXMLElement("cd.xml", 0, true); // 0 = no libxml options; true = first argument is a file path
$elms = $catalog->xpath("//CD[23]/node()");

print_r($elms);

The result is:

Array
(
    [0] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [id] => 24
                    [genre] => soul
                )

            [TITLE] => The dock of the bay
            [ARTIST] => Otis Redding
            [COUNTRY] => USA
            [COMPANY] => Stax Records
            [PRICE] => 7.90
            [YEAR] => 1968
        )

    [1] => SimpleXMLElement Object
        (
            [0] => The dock of the bay
        )

    [2] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [id] => 24
                    [genre] => soul
                )

            [TITLE] => The dock of the bay
            [ARTIST] => Otis Redding
            [COUNTRY] => USA
            [COMPANY] => Stax Records
            [PRICE] => 7.90
            [YEAR] => 1968
        ) ...

and so on, with the CD[23] element repeated after every child element. I tried many variations of this and always got duplicates in some form. Why is that?

I tried editing the XML document to remove the newlines and spaces, so that there are no text nodes between the elements, and then I got the result without duplicates. Is this how the xpath() method represents those text nodes?


Solution

  • I think part of the problem comes from the simplification that SimpleXML involves. The xpath() method returns an array of SimpleXMLElement objects. In a pure XPath mapping, a path that uses node() to select every kind of child node of CD[23] would give you a node list or node set containing both text nodes and element nodes. It seems the designers of the PHP API chose to return the parent element whenever a text node is selected, so for every text node child of CD[23] you get that same parent element in the array. To select only the child elements, you might want to use CD[23]/*.
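
    As a rough illustration of the behaviour described above, the sketch below compares node() and * on a small inline document. The inline XML is a hypothetical, shortened stand-in for cd.xml (one CD with two children), so the path uses CD[1] rather than CD[23]. The DOMXPath part is only there to show what the same expression selects when text nodes are not mapped to their parent.

```php
<?php
// Hypothetical two-child sample standing in for cd.xml. The indentation is
// deliberate: it creates whitespace-only text nodes between the elements.
$xml = <<<XML
<CATALOG>
  <CD id="24" genre="soul">
    <TITLE>The dock of the bay</TITLE>
    <ARTIST>Otis Redding</ARTIST>
  </CD>
</CATALOG>
XML;

$catalog = new SimpleXMLElement($xml);

// Helper: element name of each SimpleXMLElement in an xpath() result.
$names = function (array $nodes) {
    return array_map(function ($n) { return $n->getName(); }, $nodes);
};

// node() selects text nodes as well as elements; SimpleXML hands back the
// parent CD element in place of each text node.
$nodeNames = $names($catalog->xpath('//CD[1]/node()'));

// * selects element children only, so no text nodes are involved.
$elementNames = $names($catalog->xpath('//CD[1]/*'));

// The same node() query through DOM keeps text nodes as text nodes.
$dom = new DOMDocument();
$dom->loadXML($xml);
$domNames = [];
foreach ((new DOMXPath($dom))->query('//CD[1]/node()') as $node) {
    $domNames[] = $node->nodeName; // "#text" for text nodes
}

print_r($nodeNames);
print_r($elementNames);
print_r($domNames);
```

    With this sample, the SimpleXML node() result should alternate between CD and the child element names, while the DOM result shows "#text" in the same positions. If the whitespace itself is unwanted, passing LIBXML_NOBLANKS as the options argument of the SimpleXMLElement constructor may be an alternative to editing the file by hand, although how much whitespace libxml drops can depend on the document.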