
xpath /descendant-or-self - Searching for a node in a particular tree


I am reading about the abbreviation '//', which apparently is short for:

'/descendant-or-self::node()/'

It is clear what to expect from a simple example of such an expression, e.g.,

//myNode

It will return a node list of all elements called 'myNode' anywhere in the document, searching from the root.

However, what is the meaning of a more complicated expression, such as:

//aNode//myNode

?

Since // (being the shortcut for '/descendant-or-self') matches from the root node twice, does this mean the first part of the expression, '//aNode', is redundant and only adds to the time it takes to execute the expression (the result still being all instances of 'myNode' throughout the whole document)?

Are '//myNode' and '//aNode//myNode' going to result in exactly the same thing?

Finally, suppose I am searching the document for an instance of 'myNode' that is an indirect descendant of 'interestingTree', but I do not want the instance of 'myNode' that is an indirect descendant of 'nonInterestingTree'. How should I do this?

For example, searching in the document:

<root>
    <anode>
        <interestingTree>
            <unknownTree>
                <myNode/><!-- I want to find this one, not the other, where I don't know the path indicated by 'unknownTree' -->
            </unknownTree>
        </interestingTree>
        <nonInterestingTree>
            <unknownTree>
                <myNode/>
            </unknownTree>
        </nonInterestingTree>
    </anode>
    <anode>
        <someOtherNode/>
    </anode>
</root>

Thanks!


Solution

  • Are '//myNode' and '//aNode//myNode' going to result in exactly the same thing?

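    Yes, in this case, because every myNode is also a descendant of an anode element. Note that XPath element names are case-sensitive, so against your document the step has to be written anode, not aNode. In general, however, //anode//myNode will not match myNode elements that have no anode among their ancestors.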

    The XPath:

    //anode//myNode
    

    will ignore any intermediate hierarchy between anode and myNode, i.e. it will match /anode/myNode, /anyNodes/anode/myNode, and /anyNodes/anode/xyzNode/myNode.
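
To illustrate how the descendant step skips intermediate levels, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The sample document and its id attributes are made up for this example, and './/' is used because ElementTree paths are relative to an element rather than to the document root.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<anyNodes>"
    "<anode><myNode id='direct'/></anode>"
    "<anode><xyzNode><myNode id='nested'/></xyzNode></anode>"
    "<other><myNode id='outside'/></other>"
    "</anyNodes>"
)

# './/' stands in for a leading '//' because ElementTree paths are element-relative.
under_anode = [n.get("id") for n in doc.findall(".//anode//myNode")]
everywhere = [n.get("id") for n in doc.findall(".//myNode")]

# Element names are case-sensitive, so 'aNode' does not match 'anode'.
wrong_case = doc.findall(".//aNode//myNode")

print(under_anode)
print(everywhere)
print(len(wrong_case))
```

The first query picks up myNode elements at any depth below an anode, the second picks up every myNode, and the third finds nothing because of the capital N.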

    This also answers your last question: you can find the nodes in the interesting subtree like so (again ignoring any intermediate elements in the hierarchy):

    //anode//interestingTree//myNode
    

    Ideally, of course, you should be as explicit as possible with your paths, as // can incur a performance overhead due to the potentially large number of elements it needs to search.
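
As a sketch of what "more explicit" could look like, the snippet below compares a fully spelled-out path with the // form, using Python's standard xml.etree.ElementTree. It assumes, purely for illustration, that exactly one unknown element sits between interestingTree and myNode, so a single * step can stand in for it; if the depth really is unknown, the // form is still needed. The id attributes are added only to make the results easy to read.

```python
import xml.etree.ElementTree as ET

# The document from the question, with hypothetical id attributes added.
doc = ET.fromstring("""
<root>
    <anode>
        <interestingTree>
            <unknownTree><myNode id="wanted"/></unknownTree>
        </interestingTree>
        <nonInterestingTree>
            <unknownTree><myNode id="unwanted"/></unknownTree>
        </nonInterestingTree>
    </anode>
    <anode><someOtherNode/></anode>
</root>
""")

# Explicit steps; in full XPath this would be /root/anode/interestingTree/*/myNode.
explicit = [n.get("id") for n in doc.findall("./anode/interestingTree/*/myNode")]

# Descendant search; in full XPath this would be //interestingTree//myNode.
descendant = [n.get("id") for n in doc.findall(".//interestingTree//myNode")]

print(explicit)
print(descendant)
```

Both queries select the same single node here; the explicit one simply gives the processor fewer elements to visit.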

    Edit: Possibly this helps?

    I've adjusted your xml input for clarity to:

    <root>
        <anode>
            <interestingTree>
                <unknownTree>
                    <myNode>
                        MyNode In Interesting Tree
                    </myNode>
                </unknownTree>
            </interestingTree>
            <nonInterestingTree>
                <unknownTree>
                    <myNode>
                        MyNode In Non-Interesting Tree
                    </myNode>
                </unknownTree>
            </nonInterestingTree>
        </anode>
        <anode>
            <someOtherNode/>
        </anode>
        <bnode>
            <myNode>
                MyNode in BNode
            </myNode>
        </bnode>
    </root>
    

    When transformed with the stylesheet:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
        <xsl:template match="/">
            Matched by `//myNode`
            <xsl:apply-templates select="//myNode">
            </xsl:apply-templates>
    
            Matched by `//aNode//myNode`
            <xsl:apply-templates select="//anode//myNode">
            </xsl:apply-templates>
    
            Matched by `//aNode//interestingTree//myNode`
            <xsl:apply-templates select="//anode//interestingTree//myNode">
            </xsl:apply-templates>
        </xsl:template>
    
        <xsl:template match="myNode">
            <xsl:value-of select="text()"/>
        </xsl:template>
    </xsl:stylesheet>
    

    Returns the following:

    Matched by `//myNode`
            MyNode In Interesting Tree
            MyNode In Non-Interesting Tree
        MyNode in BNode
    
    Matched by `//aNode//myNode`
            MyNode In Interesting Tree
            MyNode In Non-Interesting Tree
    
    Matched by `//aNode//interestingTree//myNode`
            MyNode In Interesting Tree
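
If no XSLT processor is at hand, the same three selections can be cross-checked with a short Python sketch based on the standard library's xml.etree.ElementTree. This is only an approximation of the stylesheet: ElementTree implements a subset of XPath, './/' replaces the leading '//', and the helper name texts is made up for this example.

```python
import xml.etree.ElementTree as ET

# The adjusted input document from above.
doc = ET.fromstring("""
<root>
    <anode>
        <interestingTree>
            <unknownTree><myNode>MyNode In Interesting Tree</myNode></unknownTree>
        </interestingTree>
        <nonInterestingTree>
            <unknownTree><myNode>MyNode In Non-Interesting Tree</myNode></unknownTree>
        </nonInterestingTree>
    </anode>
    <anode><someOtherNode/></anode>
    <bnode><myNode>MyNode in BNode</myNode></bnode>
</root>
""")

def texts(path):
    """Return the stripped text of every element selected by path (hypothetical helper)."""
    return [node.text.strip() for node in doc.findall(path)]

all_nodes = texts(".//myNode")
under_anode = texts(".//anode//myNode")
interesting = texts(".//anode//interestingTree//myNode")

for label, result in [("//myNode", all_nodes),
                      ("//anode//myNode", under_anode),
                      ("//anode//interestingTree//myNode", interesting)]:
    print(label, result)
```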