How can I determine if an XML node has a last ancestor that does `not` have a certain attribute?

  • Any <p> tag within the <body> tags should be transformed to Body_Text.

  • The <p> tags that have a last ancestor <sec> without the attribute "sec-type" should be transformed to Flush_Text (which overrides the first Body_Text transformation here).

  • The <p> tags that have a last ancestor <sec sec-type="irrelevant-attribute-name"> (with the attribute "sec-type") should be transformed to Body_Text.

<sec><p>asdf</p></sec> should be transformed into <sec><Flush_Text>asdf</Flush_Text></sec>.

<sec sec-type="whatevs"><p>asdf</p></sec> should be <sec sec-type="whatevs"><Body_Text>asdf</Body_Text></sec>.

Also, any further nesting into an ancestor with this sec-type attribute should still be Body_Text:

<sec sec-type="whatevs"><sec><p>asdf</p></sec></sec> should be <sec sec-type="whatevs"><sec><Body_Text>asdf</Body_Text></sec></sec>.

Here is my XML:

<root>
<body>
  <sec sec-type="asdf">
    <title>This is an H1</title>
    <sec>
      <title>This is an H2</title>
      <sec>
        <title>This is an H3</title>
        <p>This SHOULD be "Body_Text", but it's "Flush_Text"</p>
      </sec> <!-- end of H3 -->
    </sec> <!-- end of H2 -->
  </sec> <!-- end of H1 -->
  <sec>
    <p>This is Flush_Text</p>
  </sec>
  <p>This is Body_Text</p>
</body>
</root>

Here is my XSL, which is not working correctly:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>

    <!-- identity rule -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Body_Text -->
    <xsl:template match="body//p">
        <Body_Text>
            <xsl:apply-templates select="@*|node()"/>
        </Body_Text>
    </xsl:template>

    <!-- Flush_Text -->
    <xsl:template match="sec//p">
        <xsl:if test="not(@sec-type)">
            <Flush_Text>
                <xsl:apply-templates select="@*|node()"/>
            </Flush_Text>
        </xsl:if>
    </xsl:template>

    <!-- H1 -->
    <xsl:template match="sec//title">
        <H1>
            <xsl:apply-templates select="@*|node()"/>
        </H1>
    </xsl:template>

    <!-- H2 -->
    <xsl:template match="sec//sec//title">
        <H2>
            <xsl:apply-templates select="@*|node()"/>
        </H2>
    </xsl:template>

    <!-- H3 -->
    <xsl:template match="sec//sec//sec//title">
        <H3>
            <xsl:apply-templates select="@*|node()"/>
        </H3>
    </xsl:template>
</xsl:stylesheet>

...and here is the incorrect output:

<?xml version="1.0" encoding="utf-16"?>
<root>
    <body>
        <sec sec-type="asdf">
            <H1>This is an H1</H1>
            <sec>
                <H2>This is an H2</H2>
                <sec>
                    <H3>This is an H3</H3>
                    <Flush_Text>This SHOULD be "Body_Text", but it's "Flush_Text"</Flush_Text>
                </sec>
                <!-- end of H3 -->
            </sec>
            <!-- end of H2 -->
        </sec>
        <!-- end of H1 -->
        <sec>
            <Flush_Text>This is Flush_Text</Flush_Text>
        </sec>
        <Body_Text>This is Body_Text</Body_Text>
    </body>
</root>

Note that the first instance of <p> in this example should be transformed to Body_Text, but it is being transformed as Flush_Text.


  • OK, so to produce the wanted results here, I changed the match pattern <xsl:template match="sec//p"> (in the XSL under Flush_Text) to <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]">, and also removed the if statement (inside a template that matches <p>, its test not(@sec-type) was checking the <p> itself, not the <sec>).

    Here is the corrected XSL:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
    <xsl:strip-space elements="*"/>
        <!-- identity rule -->
        <xsl:template match="node()|@*">
            <xsl:copy>
                <xsl:apply-templates select="node()|@*"/>
            </xsl:copy>
        </xsl:template>
        <!-- Body_Text -->
        <xsl:template match="body//p">
            <Body_Text>
                <xsl:apply-templates select="@*|node()"/>
            </Body_Text>
        </xsl:template>
        <!-- Flush_Text -->
        <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]">
            <Flush_Text>
                <xsl:apply-templates select="@*|node()"/>
            </Flush_Text>
        </xsl:template>
        <!-- H1 -->
        <xsl:template match="sec//title">
            <H1>
                <xsl:apply-templates select="@*|node()"/>
            </H1>
        </xsl:template>
        <!-- H2 -->
        <xsl:template match="sec//sec//title">
            <H2>
                <xsl:apply-templates select="@*|node()"/>
            </H2>
        </xsl:template>
        <!-- H3 -->
        <xsl:template match="sec//sec//sec//title">
            <H3>
                <xsl:apply-templates select="@*|node()"/>
            </H3>
        </xsl:template>
    </xsl:stylesheet>

    ...producing this desired output:

    <sec sec-type="asdf">
        <H1>This is an H1</H1>
        <sec>
            <H2>This is an H2</H2>
            <sec>
                <H3>This is an H3</H3>
                <Body_Text>This SHOULD be "Body_Text", but it's "Flush_Text"</Body_Text>
            </sec>
        </sec>
    </sec>
    <sec>
        <Flush_Text>This is Flush_Text</Flush_Text>
    </sec>
    <Body_Text>This is Body_Text</Body_Text>

    this was tested at:

    Thanks @Tomalak for pointing me in the right direction in the use of the ancestor xpath axis.

    Here I match any <p> whose last <sec> ancestor (what I was incorrectly calling the "highest parent") does not have the attribute sec-type, and transform it to Flush_Text. Because ancestor is a reverse axis, ancestor::sec[last()] is the outermost <sec>, not the nearest one. This prevents the first <p> in this example, which has <sec sec-type... as its last ancestor, from becoming Flush_Text, so the Body_Text rule applies to it instead.
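
    For reference, a minimal sketch of just the two <p> rules on their own. The priority="1" attribute is an addition for illustration and was not in the stylesheet tested above: in XSLT 1.0 both body//p and the predicate pattern get the default priority 0.5, so without it the choice between them depends on the processor picking the rule that comes last in the stylesheet.

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
      <xsl:strip-space elements="*"/>

      <!-- identity rule: copy anything the rules below do not handle -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <!-- default: every <p> anywhere under <body> -->
      <xsl:template match="body//p">
        <Body_Text>
          <xsl:apply-templates select="@*|node()"/>
        </Body_Text>
      </xsl:template>

      <!-- ancestor:: is a reverse axis: sec[1] would be the nearest <sec>,
           sec[last()] is the outermost one. The second predicate then
           requires that this outermost <sec> has no sec-type attribute.
           priority="1" (an assumption, see above) makes this rule win
           over body//p explicitly. -->
      <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]" priority="1">
        <Flush_Text>
          <xsl:apply-templates select="@*|node()"/>
        </Flush_Text>
      </xsl:template>
    </xsl:stylesheet>
    ```

    A <p> with no <sec> ancestor at all fails the predicate (ancestor::sec is empty), so it falls through to body//p and becomes Body_Text.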

    Also, I like Tomalak's use of automating H1 - H3... I am still experimenting with this, and don't want to use it until I fully understand it ;)
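
    Tomalak's version is not reproduced here, so the following is not necessarily what he wrote. It is only a sketch of one way the three title rules could be collapsed into one, assuming the heading level should equal the number of <sec> ancestors of the <title>.

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
      <xsl:strip-space elements="*"/>

      <!-- identity rule -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <!-- One rule for all heading levels. The name attribute of
           xsl:element is an attribute value template, so the part in
           braces is evaluated as XPath: a <title> with one <sec>
           ancestor becomes H1, with two becomes H2, and so on. -->
      <xsl:template match="sec//title">
        <xsl:element name="H{count(ancestor::sec)}">
          <xsl:apply-templates select="@*|node()"/>
        </xsl:element>
      </xsl:template>
    </xsl:stylesheet>
    ```

    Unlike the three explicit rules, this does not stop at H3: a title nested four <sec> levels deep would come out as H4.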