Any <p> tag within the <body> tags should be transformed to Body_Text.
The <p> tags that have a last ancestor <sec> without the attribute "sec-type" should be transformed to Flush_Text (which overrides the first Body_Text transformation here).
The <p> tags that have a last ancestor <sec sec-type="irrelevant-attribute-name"> (with the attribute "sec-type") should be transformed to Body_Text.
<sec><p>asdf</p></sec> should be transformed into <sec><Flush_Text>asdf</Flush_Text></sec>.
<sec sec-type="whatevs"><p>asdf</p></sec> should be <sec sec-type="whatevs"><Body_Text>asdf</Body_Text></sec>.
A <p> in a nested <sec> whose last ancestor <sec> has the sec-type attribute should still be Body_Text:
<sec sec-type="whatevs"><sec><p>asdf</p></sec></sec> should be <sec sec-type="whatevs"><sec><Body_Text>asdf</Body_Text></sec></sec>.
<root>
  <body>
    <sec sec-type="asdf">
      <title>This is an H1</title>
      <sec>
        <title>This is an H2</title>
        <sec>
          <title>This is an H3</title>
          <p>This SHOULD be "Body_Text", but it's "Flush_Text"</p>
        </sec> <!-- end of H3 -->
      </sec> <!-- end of H2 -->
    </sec> <!-- end of H1 -->
    <sec>
      <p>This is Flush_Text</p>
    </sec>
    <p>This is Body_Text</p>
  </body>
</root>
...here is my XSL, which is not working correctly:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>
  <!-- identity rule -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <!-- Body_Text -->
  <xsl:template match="body//p">
    <Body_Text>
      <xsl:apply-templates select="@*|node()"/>
    </Body_Text>
  </xsl:template>
  <!-- Flush_Text -->
  <xsl:template match="sec//p">
    <xsl:if test="not(@sec-type)">
      <Flush_Text>
        <xsl:apply-templates select="@*|node()"/>
      </Flush_Text>
    </xsl:if>
  </xsl:template>
  <!-- H1 -->
  <xsl:template match="sec//title">
    <H1>
      <xsl:apply-templates select="@*|node()"/>
    </H1>
  </xsl:template>
  <!-- H2 -->
  <xsl:template match="sec//sec//title">
    <H2>
      <xsl:apply-templates select="@*|node()"/>
    </H2>
  </xsl:template>
  <!-- H3 -->
  <xsl:template match="sec//sec//sec//title">
    <H3>
      <xsl:apply-templates select="@*|node()"/>
    </H3>
  </xsl:template>
</xsl:stylesheet>
...and here is the incorrect output:
<?xml version="1.0" encoding="utf-16"?>
<root>
  <body>
    <sec sec-type="asdf">
      <H1>This is an H1</H1>
      <sec>
        <H2>This is an H2</H2>
        <sec>
          <H3>This is an H3</H3>
          <Flush_Text>This SHOULD be "Body_Text", but it's "Flush_Text"</Flush_Text>
        </sec>
        <!-- end of H3 -->
      </sec>
      <!-- end of H2 -->
    </sec>
    <!-- end of H1 -->
    <sec>
      <Flush_Text>This is Flush_Text</Flush_Text>
    </sec>
    <Body_Text>This is Body_Text</Body_Text>
  </body>
</root>
Note that the first instance of <p> in this example should be transformed to Body_Text, but it is being transformed as Flush_Text.
Ok, so to produce the wanted results here, I have changed the statement <xsl:template match="sec//p"> (in the XSL under Flush_Text) to <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]">, and also removed the if statement.
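For anyone wondering why the original version failed: inside a template matching sec//p the context node is the <p> itself, so test="not(@sec-type)" looks for the attribute on the <p>, which never has it, and the test is always true. The ancestor axis is a reverse axis, so ancestor::sec[1] is the nearest <sec> and ancestor::sec[last()] is the outermost one. A small diagnostic sketch (text output, not part of the real transformation) that prints both checks for every <p>:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//p">
      <!-- ancestor is a reverse axis: [1] = nearest sec, [last()] = outermost sec -->
      <xsl:value-of select="concat(
        'nearest sec has sec-type: ', boolean(ancestor::sec[1]/@sec-type),
        '; outermost sec has sec-type: ', boolean(ancestor::sec[last()]/@sec-type),
        '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample XML above, the first <p> should report false for the nearest <sec> and true for the outermost one, which is exactly the case the sec//p rule got wrong.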
Here is the corrected XSL:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>
  <!-- identity rule -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <!-- Body_Text -->
  <xsl:template match="body//p">
    <Body_Text>
      <xsl:apply-templates select="@*|node()"/>
    </Body_Text>
  </xsl:template>
  <!-- Flush_Text -->
  <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]">
    <Flush_Text>
      <xsl:apply-templates select="@*|node()"/>
    </Flush_Text>
  </xsl:template>
  <!-- H1 -->
  <xsl:template match="sec//title">
    <H1>
      <xsl:apply-templates select="@*|node()"/>
    </H1>
  </xsl:template>
  <!-- H2 -->
  <xsl:template match="sec//sec//title">
    <H2>
      <xsl:apply-templates select="@*|node()"/>
    </H2>
  </xsl:template>
  <!-- H3 -->
  <xsl:template match="sec//sec//sec//title">
    <H3>
      <xsl:apply-templates select="@*|node()"/>
    </H3>
  </xsl:template>
</xsl:stylesheet>
...producing this desired output:
<root>
  <body>
    <sec sec-type="asdf">
      <H1>This is an H1</H1>
      <sec>
        <H2>This is an H2</H2>
        <sec>
          <H3>This is an H3</H3>
          <Body_Text>This SHOULD be "Body_Text", but it's "Flush_Text"</Body_Text>
        </sec>
      </sec>
    </sec>
    <sec>
      <Flush_Text>This is Flush_Text</Flush_Text>
    </sec>
    <Body_Text>This is Body_Text</Body_Text>
  </body>
</root>
This was tested at: http://xslt.online-toolz.com/tools/xslt-transformation.php.
Thanks @Tomalak for pointing me in the right direction in the use of the ancestor XPath axis.
Here I have matched any <p> whose last ancestor <sec> (what I was incorrectly calling the "highest parent") does not have the attribute sec-type, and transformed it to Flush_Text. This prevents the first instance of <p> in this example, which has <sec sec-type... as its last ancestor, from being Flush_Text, so the Body_Text template applies to it instead.
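One caveat about the override: under the XSLT 1.0 default priority rules, both match="body//p" and match="p[ancestor::sec[last()][not(@sec-type)]]" get priority 0.5, so a <p> that matches both is technically a conflict. A processor may report that as an error, or recover by picking the template that comes last in the stylesheet, which is what the result above depends on. A sketch that makes the order explicit with the priority attribute (the values 1 and 2 are my own choice; any pair where Flush_Text is higher would do):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>

  <!-- identity rule -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Body_Text: general rule for any p inside body (priority value is an assumption) -->
  <xsl:template match="body//p" priority="1">
    <Body_Text>
      <xsl:apply-templates select="@*|node()"/>
    </Body_Text>
  </xsl:template>

  <!-- Flush_Text: p whose outermost sec ancestor has no sec-type;
       the higher priority makes it win regardless of template order -->
  <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]" priority="2">
    <Flush_Text>
      <xsl:apply-templates select="@*|node()"/>
    </Flush_Text>
  </xsl:template>
</xsl:stylesheet>
```

The title templates are left out here to keep the sketch short; with this version the two <p> templates could be reordered without changing the result.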
Also, I like Tomalak's use of automating H1 - H3... I am still experimenting with this, and don't want to use it until I fully understand it ;)
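For reference, here is a minimal sketch of how the heading level could be computed from the nesting depth instead of being spelled out in three templates. This is my own reading of the idea and not necessarily what Tomalak posted; it assumes every <title> is a direct child of its <sec>, as in the sample XML:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>

  <!-- identity rule -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- the number of sec ancestors of a title is its heading level:
       1 ancestor gives H1, 2 give H2, and so on -->
  <xsl:template match="sec/title">
    <xsl:element name="H{count(ancestor::sec)}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The name attribute of xsl:element is an attribute value template, so the expression in curly braces is evaluated for each title. Note that this does not stop at H3: a title nested four levels deep would come out as H4.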