How to match sub-level elements for grouping without dropping siblings?


I'm stuck. I want to group adjacent w:p elements that share the same val and leave the rest of the document unchanged. In a first XSLT transformation I run an identity transformation and create the groups with xsl:for-each-group and group-adjacent. However, the sibling elements (w:tbl, w:bdr) are dropped, which I don't want. How can I create the groups without dropping the sibling elements?

I have already tried several things, for example wrapping the wx:sub-section elements and the w:p elements in additional layers so that the template containing the for-each-group could match them separately from their siblings. That didn't work. Matching several patterns in that template (and again in group-adjacent) helped a little, but in the end parts of the document were still dropped.

My source XML (simplified)

<wx:sect>
   <w:p val='1'>...</w:p>
   <w:p val='1'>...</w:p>
   <wx:sub-section>
      <w:p val='1'>...</w:p>
      <w:p val='1'>...</w:p>
      <w:tbl>...</w:tbl>
      <w:bdr>...</w:bdr>
      <w:p val='2'>...</w:p>
      <w:p val='2'>...</w:p>
      <w:bdr>...</w:bdr>
      <w:p val='1'>...</w:p>
      <w:p val='1'>...</w:p>
      <w:p val='3'>...</w:p>
      <w:p val='3'>...</w:p>
      <wx:sub-section>
         same structure one step down
         <wx:sub-section>
            same structure one step down (and so forth up to 5 steps)
         </wx:sub-section>
      </wx:sub-section>
   </wx:sub-section>
</wx:sect>

My stylesheet (XSLT 2.0)

I know that //wx:sect/wx:sub-section only covers the first level of wx:sub-section (I'm posting it anyway to give a clearer picture). So far I have used //wx:sect/wx:sub-section[w:p and not(wx:sub-section)] to capture the other levels, but that isn't correct either, because content is dropped there as well. Another option would be to match each level individually (//wx:sect/wx:sub-section/wx:sub-section ...), but that doesn't seem right either.

<!-- Identity transformation -->
<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="/wx:sect/wx:sub-section">
    <xsl:for-each-group select="w:p" group-adjacent="@w:val">
        <xsl:choose>
            <xsl:when test="current-grouping-key() = '1'">
                <div class="wrap1">
                    <xsl:copy-of select="current-group()"/>
                </div>
            </xsl:when>
            <xsl:when test="current-grouping-key() = '2'">
                <div class="wrap2">
                    <xsl:copy-of select="current-group()"/>
                </div>
            </xsl:when>
            ...
            <xsl:otherwise>
                <xsl:copy-of select="current-group()"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:for-each-group>
</xsl:template>

Wanted result

<wx:sect>
   <wrapper1>
     <w:p val='1'>...</w:p>
     <w:p val='1'>...</w:p>
   </wrapper1>
   <wx:sub-section>
     <wrapper1>
        <w:p val='1'>...</w:p>
        <w:p val='1'>...</w:p>
     </wrapper1>
     <w:tbl>...</w:tbl>
     <w:bdr>...</w:bdr>
     <wrapper2>
        <w:p val='2'>...</w:p>
        <w:p val='2'>...</w:p>
     </wrapper2>
     <w:bdr>...</w:bdr>
     <wrapper1>
        <w:p val='1'>...</w:p>
        <w:p val='1'>...</w:p>
     </wrapper1>
     <wrapper3>
        <w:p val='3'>...</w:p>
        <w:p val='3'>...</w:p>
     </wrapper3>
     <wx:sub-section>
        same structure
        <wx:sub-section>
           same structure (up to 5 steps)
        </wx:sub-section>
     </wx:sub-section>
   </wx:sub-section>
</wx:sect>

Solution

  • The shortest approach I have been able to come up with is https://xsltfiddle.liberty-development.net/bFDb2Cz. It uses XSLT 3 with a composite grouping key, so a single grouping checks both whether adjacent elements are w:p elements and what their @val value is:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:w="http://example.com/w"
        exclude-result-prefixes="xs"
        version="3.0">
    
      <xsl:output indent="yes"/>
    
      <xsl:mode on-no-match="shallow-copy"/>
    
      <xsl:template match="*[w:p[@val]]">
          <xsl:copy>
              <!-- keep any attributes of the parent element itself -->
              <xsl:apply-templates select="@*"/>
              <xsl:for-each-group select="*" composite="yes" group-adjacent="boolean(self::w:p), @val">
                  <xsl:choose>
                      <xsl:when test="current-grouping-key()[1]">
                          <div class="wrapper{current-grouping-key()[2]}">
                              <xsl:apply-templates select="current-group()"/>
                          </div>
                      </xsl:when>
                      <xsl:otherwise>
                          <xsl:apply-templates select="current-group()"/>
                      </xsl:otherwise>
                  </xsl:choose>
              </xsl:for-each-group>
          </xsl:copy>
      </xsl:template>
    
    </xsl:stylesheet>
    

    The <xsl:mode on-no-match="shallow-copy"/> is just an XSLT 3 declarative way to say you want to use the identity transformation.

    If you can't move to XSLT 3, then in XSLT 2 you have two options. You can nest two xsl:for-each-group instructions: the outer one uses group-adjacent="boolean(self::w:p)"; inside it, for a true grouping key, you use xsl:for-each-group select="current-group()" group-adjacent="@val", and for the other elements you use xsl:apply-templates. Or you can concatenate the two values into a single key, e.g. group-adjacent="concat(boolean(self::w:p), '|', @val)", although it is then a bit ugly to check and extract the two values inside the group.

    An XSLT 2 version using the nested grouping is at http://xsltransform.hikmatu.com/gWcDMey/1:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:w="http://example.com/w"
        exclude-result-prefixes="xs"
        version="2.0">
    
      <xsl:output indent="yes"/>
    
      <xsl:template match="@* | node()">
          <xsl:copy>
              <xsl:apply-templates select="@* | node()"/>
          </xsl:copy>
      </xsl:template>
    
      <xsl:template match="*[w:p[@val]]">
          <xsl:copy>
              <!-- keep any attributes of the parent element itself -->
              <xsl:apply-templates select="@*"/>
              <xsl:for-each-group select="*" group-adjacent="boolean(self::w:p)">
                  <xsl:choose>
                      <xsl:when test="current-grouping-key()">
                          <xsl:for-each-group select="current-group()" group-adjacent="@val">
                              <div class="wrapper{current-grouping-key()}">
                                  <xsl:apply-templates select="current-group()"/>
                              </div>                          
                          </xsl:for-each-group>
                      </xsl:when>
                      <xsl:otherwise>
                          <xsl:apply-templates select="current-group()"/>
                      </xsl:otherwise>
                  </xsl:choose>
              </xsl:for-each-group>
          </xsl:copy>
      </xsl:template>
    
    </xsl:stylesheet>
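    
    The concat alternative for XSLT 2 is only described above, not shown. A minimal, untested sketch of it could look like the following. It assumes the same w namespace URI and the unprefixed val attribute used in the stylesheets above; the '|' separator, the starts-with/substring-after checks and the extra line that copies the parent's attributes are illustrative choices of this sketch.
    
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:w="http://example.com/w"
        version="2.0">
    
      <xsl:output indent="yes"/>
    
      <!-- Identity transformation -->
      <xsl:template match="@* | node()">
          <xsl:copy>
              <xsl:apply-templates select="@* | node()"/>
          </xsl:copy>
      </xsl:template>
    
      <xsl:template match="*[w:p[@val]]">
          <xsl:copy>
              <!-- Keep the parent's own attributes (an addition in this sketch) -->
              <xsl:apply-templates select="@*"/>
              <!-- Single string key: 'true|' plus the val for w:p, 'false|' plus the val (if any) for other elements -->
              <xsl:for-each-group select="*"
                  group-adjacent="concat(boolean(self::w:p), '|', @val)">
                  <xsl:choose>
                      <!-- A key starting with 'true|' means the group consists of w:p elements -->
                      <xsl:when test="starts-with(current-grouping-key(), 'true|')">
                          <div class="wrapper{substring-after(current-grouping-key(), '|')}">
                              <xsl:apply-templates select="current-group()"/>
                          </div>
                      </xsl:when>
                      <xsl:otherwise>
                          <xsl:apply-templates select="current-group()"/>
                      </xsl:otherwise>
                  </xsl:choose>
              </xsl:for-each-group>
          </xsl:copy>
      </xsl:template>
    
    </xsl:stylesheet>
    
    Because the non-w:p elements are only passed on with xsl:apply-templates, it should not matter that adjacent non-w:p elements with different val attributes end up in separate groups; they are output in document order either way.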