
XSLT 2.0: Splitting current-group() by the first occurrence of an element


Using XSLT 2.0, suppose you have current-group() = { A, X, B, B, X } where A, B, and X are elements. What is an efficient and legible way to split it on the first occurrence of B to get two sequences S1 and S2 such that S1 = { A, X } and S2 = { B, B, X }? Is it possible to accomplish this using an xsl:for-each-group construct?

EDIT: The elements of the current-group() are not guaranteed to be siblings but are guaranteed to be in document order.


First attempt: Using xsl:for-each-group with group-starting-with

<xsl:for-each-group select="current-group()" group-starting-with="B[1]">
  <xsl:choose>
    <xsl:when test="position() = 1">
      <!-- S1 := current-group() -->
    </xsl:when>
    <xsl:otherwise>
      <!-- S2 := current-group() -->
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each-group>

This works only if the first B of current-group() has no preceding sibling B. I would have expected the positional predicate [1] to be scoped to the select expression, since current-group()[self::B][1] returns the correct B. I'm curious to know why it isn't scoped that way.

XML

<root>
  <A>A1</A>
  <B>B1-1</B>
  <B>B1-2</B>
  <A>A2</A>
  <B>B2-1</B>
  <B>B2-2</B>
</root>

XSLT

<xsl:template match="root">
  <xsl:copy>
    <xsl:for-each-group select="*" group-starting-with="A">
      <xsl:for-each-group select="current-group()" group-starting-with="B[1]">
        <xsl:choose>
          <xsl:when test="position() = 1">
            <S1><xsl:copy-of select="current-group()" /></S1>
          </xsl:when>
          <xsl:otherwise>
            <S2><xsl:copy-of select="current-group()" /></S2>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

Result

<root>
  <S1>
    <A>A1</A>
  </S1>
  <S2>
    <B>B1-1</B>
    <B>B1-2</B>
  </S2>
  <S1>
    <A>A2</A>
    <B>B2-1</B>
    <B>B2-2</B>
  </S1>
</root>

As you can see, the first group is split correctly, but the second group is not. It does work if you wrap current-group() in a parent and pass that to the select attribute, but that seems inefficient.
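
For reference, the wrapping workaround mentioned above could look roughly like the sketch below. The variable name `wrapped` is an assumption made for illustration. The idea is that a pattern such as B[1] is matched against each node in its own tree, so it means "the first B child of its parent" and not "the first B in the selected sequence". Copying the group into a temporary tree gives its members a fresh parent, at the cost of a deep copy and of losing the original node identity and ancestors.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <xsl:for-each-group select="*" group-starting-with="A">
        <!-- "wrapped" is a hypothetical name: a temporary tree whose document
             node becomes the parent of copies of the group's members -->
        <xsl:variable name="wrapped">
          <xsl:copy-of select="current-group()"/>
        </xsl:variable>
        <!-- Inside the temporary tree, B[1] is the first B of this group -->
        <xsl:for-each-group select="$wrapped/*" group-starting-with="B[1]">
          <xsl:choose>
            <xsl:when test="position() = 1">
              <S1><xsl:copy-of select="current-group()"/></S1>
            </xsl:when>
            <xsl:otherwise>
              <S2><xsl:copy-of select="current-group()"/></S2>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that the position() = 1 test assumes every group starts with something other than B. A group that begins with B would have its only subgroup labelled S1.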


Solution

  • The functx library defines a function functx:index-of-node (http://www.xsltfunctions.com/xsl/functx_index-of-node.html):

    <xsl:function name="functx:index-of-node" as="xs:integer*"
                  xmlns:functx="http://www.functx.com">
      <xsl:param name="nodes" as="node()*"/>
      <xsl:param name="nodeToFind" as="node()"/>
    
      <xsl:sequence select="
      for $seq in (1 to count($nodes))
      return $seq[$nodes[$seq] is $nodeToFind]
     "/>
    
    </xsl:function>
    

    That would reduce your second approach to

    <xsl:template match="root">
      <xsl:copy>
        <xsl:for-each-group select="*" group-starting-with="A">
            <xsl:variable name="pos" select="functx:index-of-node(current-group(), (current-group()[self::B])[1])"/>
            <S1>
                <xsl:copy-of select="current-group()[position() lt $pos]"/>
            </S1>
            <S2>
                <xsl:copy-of select="current-group()[position() ge $pos]"/>
            </S2>
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:template>
    

    In the "new" "XSLT 4" world of Saxon 10 PE or EE, with the extension functions saxon:items-before and saxon:items-from and the syntax extension for anonymous functions, you could write it as

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:saxon="http://saxon.sf.net/"
        exclude-result-prefixes="#all" version="3.0">
    
        <xsl:mode on-no-match="shallow-copy"/>
    
        <xsl:output indent="yes"/>
    
        <xsl:template match="root">
            <xsl:copy>
                <xsl:for-each-group select="*" group-starting-with="A">
                    <S1>
                        <xsl:apply-templates
                            select="saxon:items-before(current-group(), .{ . instance of element(B) })"/>
                    </S1>
                    <S2>
                        <xsl:apply-templates
                            select="saxon:items-from(current-group(), .{ . instance of element(B) })"/>
                    </S2>
                </xsl:for-each-group>
            </xsl:copy>
        </xsl:template>
    
    </xsl:stylesheet>
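
If you would rather stay with plain XSLT 2.0 and xsl:for-each-group, one possible variation is to bind the first B of the group to a variable and compare node identity inside the pattern. This is only a sketch under a few assumptions: the variable name `first-b` is made up for illustration, and it relies on XSLT 2.0 allowing variable references in the group-starting-with pattern. Because it compares identity and does not use a positional predicate, it does not depend on the nodes being siblings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <xsl:for-each-group select="*" group-starting-with="A">
        <!-- "first-b" is a hypothetical name: the first B of this group,
             or the empty sequence if the group contains no B -->
        <xsl:variable name="first-b" select="current-group()[self::B][1]"/>
        <!-- A new subgroup starts only at the node that is identical to $first-b -->
        <xsl:for-each-group select="current-group()"
                            group-starting-with="B[. is $first-b]">
          <xsl:choose>
            <!-- The subgroup headed by $first-b is S2; anything before it is S1 -->
            <xsl:when test="current-group()[1] is $first-b">
              <S2><xsl:copy-of select="current-group()"/></S2>
            </xsl:when>
            <xsl:otherwise>
              <S1><xsl:copy-of select="current-group()"/></S1>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

When a group contains no B, $first-b is empty, the `is` comparisons evaluate to the empty sequence (treated as false), and the whole group ends up in a single S1.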