
How to group a flat, uncategorized list into sublists using XSLT grouping


I am trying to group the following HTML content:

Input

<?xml version="1.0" encoding="UTF-8"?>
<html>
    <body>
        <h2>Steps for grouping in the Muench method</h2>
        <p class="step">Define a <code>key</code> for the property we want
            to use for grouping.</p>
        <p class="step">Select all of the nodes ...</p>
        <p class="substep">Select all of the nodes ...</p>
        <p class="result">Select all of the nodes ...</p>
        <p class="substep">Select all of the nodes ...</p>
        <p class="step">For each unique grouping value ...</p>
        <h2>Steps for grouping in XSLT 2.0</h2>
        <p class="step">Define an XPath expression ...</p>
        <p class="substep">Select all of the nodes ...</p>
        
        <p class="substep">Instead of dealing with each ...</p>
    </body>
</html>

I want to transform it into the output below. I looked at a couple of answers here before posting; they gave me an idea of how to group following siblings with a target element. For example, I want to group the p with @class='result' inside substep, and each substep inside step:

<body>
        <h2>Steps for grouping in the Muench method</h2>
        <step>
            <p class="step">Define a <code>key</code> for the property we want to use for
                grouping.</p>
        </step>
        <step>
            <p class="step">Select all of the nodes ...</p>
            <substep>
                <p class="substep">Select all of the nodes ...</p>
                <p class="result">Select all of the nodes ...</p>
            </substep>
            <substep>
                <p class="substep">Select all of the nodes ...</p>
            </substep>
        </step>
        <step>
            <p class="step">For each unique grouping value ...</p>
            <h2>Steps for grouping in XSLT 2.0</h2>
        </step>
        <step>
            <p class="step">Define an XPath expression ...</p>
            <substep>
                <p class="substep">Select all of the nodes ...</p>
            </substep>
            <substep>
                <p class="substep">Instead of dealing with each ...</p>
            </substep>
        </step>
    </body>

What I have done so far

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:key name="step" match="p" use="@class"/>
    <xsl:template match="/">
      <body>
        <xsl:variable name="steps">
            <xsl:for-each-group select="html/body/*" group-starting-with="p[@class = 'step']">
                <xsl:choose>
                    <xsl:when test="current-group()[self::p[@class = 'step']]">
                        <step>
                            <xsl:copy-of select="current-group()"/>
                        </step>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:copy-of select="."/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:variable>
        <xsl:apply-templates select="$steps" mode="fix.step"/>
      </body>
    </xsl:template>
    <xsl:template match="node()" mode="fix.step">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="step" mode="fix.step">
        <step>
            <xsl:for-each-group select="*" group-starting-with="p[@class = 'substep']">
                <xsl:choose>
                    <xsl:when test="current-group()[self::p[@class = 'substep']]">
                        <substep>
                            <xsl:copy-of select="current-group()"/>
                        </substep>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:copy-of select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </step>
    </xsl:template>
</xsl:stylesheet>

Questions

Is there a better way to achieve the same result?

A few caveats

The given input file is only an example. There could be many elements other than the step and substep paragraphs between a step and a substep. They need to be grouped within either step or substep, depending on their position. For example, the following could be another variation of the input file:

<?xml version="1.0" encoding="UTF-8"?>
<html>
    <body>
        <h2>Steps for grouping in the Muench method</h2>
        <p class="step">Define a <code>key</code> for the property we want
            to use for grouping.</p>
        <p class="step">Select all of the nodes ...</p>
        <p class="result">Select all of the nodes ...</p>
        <image href="image.pn"/>
        <p class="substep">Select all of the nodes ...</p>
        <p class="substep">Select all of the nodes ...</p>
        <p class="result">Select all of the nodes ...</p>
        <p class="substep">Select all of the nodes ...</p>
        <p class="step">For each unique grouping value ...</p>
        <h2>Steps for grouping in XSLT 2.0</h2>
        <p class="step">Define an XPath expression ...</p>
        <p class="substep">Select all of the nodes ...</p>
        
        <p class="substep">Instead of dealing with each ...</p>
    </body>
</html>

Rules

  1. When a list item at level 1 has following siblings p/@class=ListContinue, p/@class=ListNote or p/@class=ListBullet2, all such adjacent elements should be wrapped in li as well, except where p/@class=BodyText or p/@class=Note.
  2. When a list item at level 2 has following siblings p/@class=ListContinue2 or p/@class=ListNote, all such adjacent elements should be wrapped in li as well, except where p/@class=BodyText, p/@class=Note or p/@class=ListBullet.

That way, all information related to a particular list item is wrapped within li. The same logic applies to every li, irrespective of depth. That is what I am finding difficult to achieve.


Solution

  • Couldn't you simply do:

    XSLT 2.0

    <xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" version="1.0" encoding="utf-8" indent="yes"/>  
    
    <xsl:template match="/html">
        <body>
            <xsl:for-each-group select="body/*" group-starting-with="p[@class='step']">
                <xsl:choose>
                    <xsl:when test="self::p[@class='step']">
                        <step>
                            <xsl:copy-of select="."/>     
                            <xsl:for-each-group select="current-group() except ." group-starting-with="p[@class='substep']">
                                <xsl:choose>
                                    <xsl:when test="self::p[@class='substep']">
                                        <substep>
                                            <xsl:copy-of select="current-group()"/>
                                        </substep> 
                                    </xsl:when>
                                    <xsl:otherwise>
                                        <xsl:copy-of select="current-group()"/>
                                    </xsl:otherwise>
                                </xsl:choose>
                            </xsl:for-each-group>    
                        </step>
                    </xsl:when>
                    <xsl:otherwise>
                        <!-- Copy every element that precedes the first step, not only the first one -->
                        <xsl:copy-of select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </body>
    </xsl:template>
    
    </xsl:stylesheet>
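
    If the nesting can go deeper than two levels, the same group-starting-with idea could probably be written once and applied recursively. The following is only a sketch under assumptions: the $levels variable, the template name group, and the convention that each wrapper element is named after the class value are hypothetical and not part of the question. It does not handle the "except where" classes from the rules above.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs">
        <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

        <!-- Assumption: class values listed outermost first; each one is
             also used as the name of the wrapper element. -->
        <xsl:variable name="levels" select="('step', 'substep')"/>

        <xsl:template match="/html">
            <body>
                <xsl:call-template name="group">
                    <xsl:with-param name="items" select="body/*"/>
                    <xsl:with-param name="depth" select="1"/>
                </xsl:call-template>
            </body>
        </xsl:template>

        <!-- Hypothetical helper: wraps each run that starts with a p of the
             current level's class, then recurses into the rest of the run. -->
        <xsl:template name="group">
            <xsl:param name="items" as="element()*"/>
            <xsl:param name="depth" as="xs:integer"/>
            <xsl:choose>
                <!-- No deeper level is defined: copy the elements unchanged -->
                <xsl:when test="$depth gt count($levels)">
                    <xsl:copy-of select="$items"/>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:variable name="class" select="$levels[$depth]"/>
                    <xsl:for-each-group select="$items"
                        group-starting-with="p[@class = $class]">
                        <xsl:choose>
                            <xsl:when test="self::p[@class = $class]">
                                <xsl:element name="{$class}">
                                    <xsl:copy-of select="."/>
                                    <xsl:call-template name="group">
                                        <xsl:with-param name="items"
                                            select="current-group() except ."/>
                                        <xsl:with-param name="depth"
                                            select="$depth + 1"/>
                                    </xsl:call-template>
                                </xsl:element>
                            </xsl:when>
                            <!-- Elements before the first starter of this level -->
                            <xsl:otherwise>
                                <xsl:copy-of select="current-group()"/>
                            </xsl:otherwise>
                        </xsl:choose>
                    </xsl:for-each-group>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>

    </xsl:stylesheet>
    ```

    With $levels set to ('step', 'substep') this should behave like the nested version above. Adding a third class value to the sequence would add a third nesting level without touching the template.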