Using XSLT 2.0, suppose you have current-group() = { A, X, B, B, X }, where A, B, and X are elements. What is an efficient and legible way to split it on the first occurrence of B to get two sequences S1 and S2 such that S1 = { A, X } and S2 = { B, B, X }? Is it possible to accomplish this using an xsl:for-each-group construct?
EDIT: The elements of current-group() are not guaranteed to be siblings, but they are guaranteed to be in document order.
First attempt: Using xsl:for-each-group with group-starting-with
<xsl:for-each-group select="current-group()" group-starting-with="B[1]">
  <xsl:choose>
    <xsl:when test="position() = 1">
      <!-- S1 := current-group() -->
    </xsl:when>
    <xsl:otherwise>
      <!-- S2 := current-group() -->
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each-group>
This works provided the first B in current-group() has no preceding sibling B. I would have thought the position predicate [1] would be scoped to the select clause, since current-group()[self::B][1] returns the correct B. I'm curious to know why it isn't scoped that way.
XML
<root>
  <A>A1</A>
  <B>B1-1</B>
  <B>B1-2</B>
  <A>A2</A>
  <B>B2-1</B>
  <B>B2-2</B>
</root>
XSLT
<xsl:template match="root">
  <xsl:copy>
    <xsl:for-each-group select="*" group-starting-with="A">
      <xsl:for-each-group select="current-group()" group-starting-with="B[1]">
        <xsl:choose>
          <xsl:when test="position() = 1">
            <S1><xsl:copy-of select="current-group()" /></S1>
          </xsl:when>
          <xsl:otherwise>
            <S2><xsl:copy-of select="current-group()" /></S2>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
Result
<root>
  <S1>
    <A>A1</A>
  </S1>
  <S2>
    <B>B1-1</B>
    <B>B1-2</B>
  </S2>
  <S1>
    <A>A2</A>
    <B>B2-1</B>
    <B>B2-2</B>
  </S1>
</root>
As you can see, the first group is split correctly, but the second group is not. It does work if you wrap current-group() in a parent element and pass that to the select clause, but that seems inefficient.
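For reference, here is a minimal sketch of the wrapper workaround mentioned above, assuming the sample input shown earlier. The variable name wrapper is chosen only for illustration. Copying the group into a temporary tree makes the copies siblings under one parent, so B[1] matches the first B of the group, at the cost of copying every node and losing the originals' identity and ancestors.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <xsl:for-each-group select="*" group-starting-with="A">
        <!-- Hypothetical name: a temporary tree whose children are
             copies of the current group, and therefore siblings -->
        <xsl:variable name="wrapper">
          <xsl:copy-of select="current-group()"/>
        </xsl:variable>
        <!-- B[1] is now the first B child of the temporary tree -->
        <xsl:for-each-group select="$wrapper/*" group-starting-with="B[1]">
          <xsl:choose>
            <xsl:when test="position() = 1">
              <S1><xsl:copy-of select="current-group()"/></S1>
            </xsl:when>
            <xsl:otherwise>
              <S2><xsl:copy-of select="current-group()"/></S2>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Note that the position() = 1 test still labels the first subgroup S1 even when the group happens to start with B.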
The functx library defines a function functx:index-of-node (http://www.xsltfunctions.com/xsl/functx_index-of-node.html):
<xsl:function name="functx:index-of-node" as="xs:integer*"
              xmlns:functx="http://www.functx.com">
  <xsl:param name="nodes" as="node()*"/>
  <xsl:param name="nodeToFind" as="node()"/>
  <xsl:sequence select="
    for $seq in (1 to count($nodes))
    return $seq[$nodes[$seq] is $nodeToFind]
  "/>
</xsl:function>
That would reduce your second approach to
<xsl:template match="root">
  <xsl:copy>
    <xsl:for-each-group select="*" group-starting-with="A">
      <xsl:variable name="pos" select="functx:index-of-node(current-group(), (current-group()[self::B])[1])"/>
      <S1>
        <xsl:copy-of select="current-group()[position() lt $pos]"/>
      </S1>
      <S2>
        <xsl:copy-of select="current-group()[position() ge $pos]"/>
      </S2>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
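As to why B[1] is not scoped to the select clause: group-starting-with takes a pattern, and a pattern is matched against each node in its own tree, so [1] means "first B child of its parent", not "first B in the selected population". Since your items are guaranteed to be in document order, a plain XSLT 2.0 alternative that needs no helper function is to bind the first B to a variable and compare with the node-order operator <<. This is only a sketch, assuming the sample input from the question. The variable name firstB is made up for illustration.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <xsl:for-each-group select="*" group-starting-with="A">
        <!-- Hypothetical name: first B within the current group,
             or the empty sequence if the group has no B -->
        <xsl:variable name="firstB" select="(current-group()[self::B])[1]"/>
        <S1>
          <!-- Everything before the first B; the whole group if there is no B -->
          <xsl:copy-of select="current-group()[empty($firstB) or . &lt;&lt; $firstB]"/>
        </S1>
        <S2>
          <!-- The first B and everything after it; empty if there is no B -->
          <xsl:copy-of select="current-group()[exists($firstB) and not(. &lt;&lt; $firstB)]"/>
        </S2>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The empty($firstB) and exists($firstB) guards are there for groups without any B. Without such a guard, the functx variant would fail on that kind of group, because an empty sequence does not satisfy the as="node()" declaration of the second parameter.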
In the "new" "XSLT 4" world of Saxon 10 PE or EE, with the extension functions saxon:items-before and saxon:items-from and the syntax extension for anonymous functions, you could write it as
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="#all" version="3.0">
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:output indent="yes"/>
  <xsl:template match="root">
    <xsl:copy>
      <xsl:for-each-group select="*" group-starting-with="A">
        <S1>
          <xsl:apply-templates
              select="saxon:items-before(current-group(), .{ . instance of element(B) })"/>
        </S1>
        <S2>
          <xsl:apply-templates
              select="saxon:items-from(current-group(), .{ . instance of element(B) })"/>
        </S2>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>