Select only the closest ancestor of any node that contains a given string, where the ancestor must have a given attribute


Here is some example XML. From it, I need to select only the closest ancestor of an element that contains a given string; the ancestor must have the @isDoc attribute.

In the example below, I would expect to get only the <example> node in the resulting node-set if the value of $macroAlias were match.

<container isDoc="">
    <site isDoc="">
        <home isDoc="">
            <page>Some text</page>
            <page>Some text</page>
            <example isDoc="">
                <title>A title</title>
                <body><![CDATA[Some text and a page containing a <p>string to match</p>]]></body>
            </example>
            <page>Some text</page>
        </home>
    </site>
</container>

My current query can be found below. The problem with it is that it selects not only the closest ancestor but all other ancestors above that too. I really only want the ancestor node (with the @isDoc attribute) of the node containing the given string ($macroAlias).

<xsl:if test="$macroAlias != ''">
    <xsl:variable name="nodes"
                  select="//node()[contains(., $macroAlias)][ancestor::*[@isDoc][1]]/parent::*[@isDoc]"/>

        <ul>
            <li>
                <xsl:for-each select="$nodes">
                    <xsl:value-of select="name()"/>
                </xsl:for-each>
            </li>
        </ul>
</xsl:if>

I've tried many different ways to achieve this and either end up with the same result or no results in my output.


Solution

  • The problem with the . in node()[contains(., $macroAlias)] is that it matches any node() (text() nodes included) whose string value contains $macroAlias, and the string value of an element includes the text of all its descendants. For an explanation of the difference between text() and ., see this question.

    So, with $macroAlias set to match, the predicate is true for:

    • /container[@isDoc] and
    • /container[@isDoc]/site[@isDoc] and
    • /container[@isDoc]/site[@isDoc]/home[@isDoc] and
    • /container[@isDoc]/site[@isDoc]/home[@isDoc]/example[@isDoc] and
    • /container[@isDoc]/site[@isDoc]/home[@isDoc]/example[@isDoc]/body and
    • /container[@isDoc]/site[@isDoc]/home[@isDoc]/example[@isDoc]/body/text()
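
    To see this for yourself, a small diagnostic stylesheet can print every element and text node that passes the contains(., $macroAlias) test. This is only a sketch: the text output format and the default value 'match' for $macroAlias are my own choices, not part of the question.

    ```xml
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="text"/>
      <!-- Assumed default; pass your own value for macroAlias when running. -->
      <xsl:param name="macroAlias" select="'match'"/>

      <xsl:template match="/">
        <!-- Every element or text node whose string value contains $macroAlias. -->
        <xsl:for-each select="//*[contains(., $macroAlias)] | //text()[contains(., $macroAlias)]">
          <xsl:choose>
            <!-- Elements are reported by name. -->
            <xsl:when test="self::*"><xsl:value-of select="name()"/></xsl:when>
            <!-- Text nodes are reported together with the name of their parent. -->
            <xsl:otherwise>text() inside <xsl:value-of select="name(..)"/></xsl:otherwise>
          </xsl:choose>
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Run against the XML from the question, this lists container, site, home, example, body and the text() inside body, one per line. Your query then steps from each of those to parent::*[@isDoc], which is why every @isDoc ancestor ends up in the result.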

    If $macroAlias occurs in a text() node that is a direct child of an element with @isDoc, I would try this (note that in XPath 1.0, contains(text(), ...) only looks at the first text() child):

    <xsl:variable name="nodes"
                  select="//*[@isDoc][contains(text(), $macroAlias)]"/>
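
    In XPath 1.0, a node-set passed to contains() is converted to a string by taking only its first node in document order, so the expression above ignores any later text() children. If the element has mixed content, one possible variant nests the test so that every direct text() child is checked. The stylesheet around the expression is just scaffolding I added so the sketch can be run as-is:

    ```xml
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="text"/>
      <!-- Assumed default; pass your own value for macroAlias when running. -->
      <xsl:param name="macroAlias" select="'match'"/>

      <xsl:template match="/">
        <!-- @isDoc elements with at least one direct text() child containing $macroAlias. -->
        <xsl:for-each select="//*[@isDoc][text()[contains(., $macroAlias)]]">
          <xsl:value-of select="name()"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>

    </xsl:stylesheet>
    ```

    On the XML from the question this selects nothing, because the matching text is a child of <body>, which has no @isDoc attribute.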
    

    If the text containing $macroAlias is not a direct child of an element with an @isDoc attribute (in your example it sits inside <body>), you could use:

    <xsl:variable name="nodes"
                  select="//text()[contains(., $macroAlias)]/ancestor::*[@isDoc][1]"/>
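
    The [1] works here because ancestor:: is a reverse axis: inside a step, positions are counted from the context node outwards, so [1] is the nearest matching ancestor. Wrapping the whole path in parentheses changes that, since the predicate then applies to the resulting node-set in document order. A short sketch of the difference (the variable name hits and the output labels are mine):

    ```xml
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="text"/>
      <!-- Assumed default; pass your own value for macroAlias when running. -->
      <xsl:param name="macroAlias" select="'match'"/>

      <xsl:template match="/">
        <xsl:variable name="hits" select="//text()[contains(., $macroAlias)]"/>

        <!-- [1] inside the step: nearest @isDoc ancestor of each matching text node. -->
        <xsl:text>closest: </xsl:text>
        <xsl:for-each select="$hits/ancestor::*[@isDoc][1]">
          <xsl:value-of select="name()"/>
          <xsl:text> </xsl:text>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>

        <!-- [1] outside the parentheses: first node of the whole set in document order. -->
        <xsl:text>outermost: </xsl:text>
        <xsl:value-of select="name(($hits/ancestor::*[@isDoc])[1])"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With the XML from the question, the first line reports example and the second reports container.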
    

    Based on this XML:

    <container isDoc="">
      <site isDoc="">
        <home isDoc="">
          <page>Some text</page>
          <page>Some text</page>
          <example isDoc="">
            <title>A title</title>
            <body><![CDATA[Some text and a page containing a <p>string to match</p>]]></body>
          </example>
          <page>Some text</page>
          <contact><![CDATA[Some text and a page containing a <p>string to not match</p>]]></contact>
        </home>
      </site>
    </container>
    

    using this XSLT:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" 
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>
      <xsl:param name="macroAlias" select="'A title'"/>
      
      <xsl:template match="/">
        <xsl:if test="$macroAlias != ''">
          <!-- A text node is never matched by *, so ancestor:: is equivalent to ancestor-or-self:: here. -->
          <xsl:variable name="nodes" select="//text()[contains(., $macroAlias)]/ancestor::*[@isDoc][1]"/>
          <ul>
            <li>
              <xsl:for-each select="$nodes">
                <xsl:value-of select="name()"/>
              </xsl:for-each>
            </li>
          </ul>
        </xsl:if>  
      </xsl:template>
      
    </xsl:stylesheet>
    

    will give this result:

    <?xml version="1.0" encoding="UTF-8"?>
    <ul>
       <li>example</li>
    </ul>
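
    One thing to watch: the xsl:for-each sits inside a single <li>, so if several elements qualify, their names are written into the same list item. If you would rather have one <li> per match, a possible variant moves the <li> inside the loop. This is a sketch of that idea, with 'match' as an assumed parameter value:

    ```xml
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:strip-space elements="*"/>
      <!-- Assumed value; pass your own value for macroAlias when running. -->
      <xsl:param name="macroAlias" select="'match'"/>

      <xsl:template match="/">
        <xsl:if test="$macroAlias != ''">
          <xsl:variable name="nodes"
                        select="//text()[contains(., $macroAlias)]/ancestor::*[@isDoc][1]"/>
          <ul>
            <!-- One list item per selected element, in document order. -->
            <xsl:for-each select="$nodes">
              <li>
                <xsl:value-of select="name()"/>
              </li>
            </xsl:for-each>
          </ul>
        </xsl:if>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With the second XML sample above, 'match' also occurs in <contact>, whose closest @isDoc ancestor is <home>, so this produces two items: home and example.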