
How to determine the nesting level in XPath?


In the following example, I would like to determine the "nesting level" of a node with an XPath 2.0 expression. By "nesting level" I mean the length of the longest chain of consecutively nested descendants below the node, e.g. if "span/span/span" exists below it, the level is 3. The expected nesting levels are given in comments:

<?xml version="1.0" encoding="UTF-8"?>
<text>
    <div>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget
        dolor. Aenean massa.
        <span><!--nesting level:2-->Cum sociis natoque penatibus et magnis dis parturient montes,
            nascetur ridiculus mus.
            <span><!--nesting levels:1-->Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem.
                <span><!--nesting levels:0-->Nulla consequat massa quis enim.</span>
            </span>
            <span><!--nesting levels:0-->Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu.</span>
            In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo.
        </span>
        <span><!--nesting levels:0-->Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus
            elementum semper nisi.
        </span>
        <span><!--nesting levels:0-->Aenean vulputate eleifend tellus. Aenean leo ligula,
            porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra
            quis, feugiat a, tellus.
        </span>
    </div>
    <div>Phasellus viverra nulla ut metus varius laoreet.
        <span><!--nesting levels:0-->Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue.
        </span>
        <span><!--nesting levels:2-->Curabitur ullamcorper ultricies nisi.
            <span><!--nesting levels:0-->Nam eget dui.</span>
            Etiam rhoncus.
            <span><!--nesting levels:1-->Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet
                adipiscing sem neque sed ipsum.
                <span><!--nesting levels:0-->Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem.</span>
                <span><!--nesting levels:0-->Maecenas nec odio et ante tincidunt tempus.</span>
                Donec vitae sapien ut libero venenatis faucibus.
                <span><!--nesting levels:0-->Nullam quis ante.</span>
            </span>
            Etiam sit amet orci eget eros faucibus tincidunt. Duis leo. Sed fringilla mauris sit amet
            nibh.
        </span>
        Donec sodales sagittis magna.
    </div>
</text>

I tried count(descendant::span), but this counts every descendant span, including siblings at the same depth, so it gives a wrong result in many cases. I also tried count(descendant::span[1]) and count(descendant::span[position() = 1]), which also gave erroneous results. I have not yet figured out how to exclude the siblings from the total. Any hint is appreciated.


Solution

  • In XSLT 2.0, I get the right values with the expression

    max(
      for $leaf in descendant-or-self::span[not(span)]
      return count($leaf/ancestor-or-self::span except ancestor-or-self::span)
    )
    

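    This takes every innermost span at or below the context node, counts the span elements strictly below the context node on the path down to it, and returns the largest count. For example, with the stylesheet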

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="2.0">
    
        <!-- Identity template: copy every node and attribute unchanged. -->
        <xsl:template match="@* | node()">
            <xsl:copy>
                <xsl:apply-templates select="@* | node()"/>
            </xsl:copy>
        </xsl:template>
    
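        <!-- For each span, add a nesting-level attribute holding the depth of the
             deepest span nested inside it (0 if it contains no span). -->
        <xsl:template match="span">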
            <xsl:copy>
                <xsl:attribute name="nesting-level"
                    select=" 
                    max(
                      for $leaf in descendant-or-self::span[not(span)]
                      return count($leaf/ancestor-or-self::span except ancestor-or-self::span)
                    )"/>
                <xsl:apply-templates select="@* | node()"/>
            </xsl:copy>
        </xsl:template>
    
    </xsl:stylesheet>
    

    I get the output

    <?xml version="1.0" encoding="UTF-8"?><text>
        <div>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget
            dolor. Aenean massa.
            <span nesting-level="2"><!--nesting level:2-->Cum sociis natoque penatibus et magnis dis parturient montes,
                nascetur ridiculus mus.
                <span nesting-level="1"><!--nesting levels:1-->Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem.
                    <span nesting-level="0"><!--nesting levels:0-->Nulla consequat massa quis enim.</span>
                </span>
                <span nesting-level="0"><!--nesting levels:0-->Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu.</span>
                In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo.
            </span>
            <span nesting-level="0"><!--nesting levels:0-->Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus
                elementum semper nisi.
            </span>
            <span nesting-level="0"><!--nesting levels:0-->Aenean vulputate eleifend tellus. Aenean leo ligula,
                porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra
                quis, feugiat a, tellus.
            </span>
        </div>
        <div>Phasellus viverra nulla ut metus varius laoreet.
            <span nesting-level="0"><!--nesting levels:0-->Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue.
            </span>
            <span nesting-level="2"><!--nesting levels:2-->Curabitur ullamcorper ultricies nisi.
                <span nesting-level="0"><!--nesting levels:0-->Nam eget dui.</span>
                Etiam rhoncus.
                <span nesting-level="1"><!--nesting levels:1-->Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet
                    adipiscing sem neque sed ipsum.
                    <span nesting-level="0"><!--nesting levels:0-->Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem.</span>
                    <span nesting-level="0"><!--nesting levels:0-->Maecenas nec odio et ante tincidunt tempus.</span>
                    Donec vitae sapien ut libero venenatis faucibus.
                    <span nesting-level="0"><!--nesting levels:0-->Nullam quis ante.</span>
                </span>
                Etiam sit amet orci eget eros faucibus tincidunt. Duis leo. Sed fringilla mauris sit amet
                nibh.
            </span>
            Donec sodales sagittis magna.
        </div>
    </text>
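
    A shorter variant, offered as a sketch rather than as part of the original answer, is to subtract the context node's own span depth from the deepest span depth at or below it. It relies on XPath 2.0 allowing a function call as the last step of a path, and it assumes that span elements are nested directly inside one another, with no other element in between:

    ```xml
    <!-- Drop-in replacement for the xsl:attribute in the span template above. -->
    <xsl:attribute name="nesting-level"
        select="max(descendant-or-self::span/count(ancestor::span))
                - count(ancestor::span)"/>
    ```

    For the outermost span of the first div, the deepest span below it has two span ancestors and the span itself has none, which gives 2 - 0 = 2. For a span without span descendants, both terms are equal and the result is 0.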