Grouping substrings with XSLT


I'm new to XSL transforms and am having trouble grouping substrings. I have some XML like the following:

<?xml version="1.0" encoding="UTF-8"?>
<document-root>
  <classes>
    <class1>CATSryverty</class1>
    <class1>CATSt6vvy</class1>
    <class1>CATS4yv6v</class1>
    <class1>DOGSrybytb</class1>
    <class1>DOGSbu6b</class1>
    <class1>DOGS5u57756</class1>
  </classes>
</document-root>

and this XSL:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
    <docs>
        <xsl:for-each select="document-root/classes">
            <xsl:element name="classesCSV">
                <xsl:for-each select="class1/text()">
                    <xsl:value-of select="substring(., 1, 4)"/>
                    <xsl:if test="not(position() = last())">,</xsl:if>
                </xsl:for-each>
            </xsl:element>
        </xsl:for-each>
    </docs>
</xsl:template>
</xsl:stylesheet>

And that gets me this:

<?xml version="1.0" encoding="UTF-8"?>
<docs>
 <classesCSV>CATS,CATS,CATS,DOGS,DOGS,DOGS</classesCSV>
</docs>

But what I'd like is this:

<?xml version="1.0" encoding="UTF-8"?>
<docs>
 <classesCSV>CATS,DOGS</classesCSV>
</docs>

How should I change it?


Solution

  • XSLT 1.0 (Muenchian grouping):

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <!-- Index every class1 element by its first four characters -->
    <xsl:key name="kDistinct" match="class1" use="substring(., 1,4)"/>
    <xsl:template match="/">
     <docs>
        <xsl:for-each select="document-root/classes">
           <xsl:element name="classesCSV">
              <!-- Keep only the first class1 in each key group -->
              <xsl:for-each select="class1[generate-id() = 
                              generate-id(key('kDistinct', substring(.,1,4))[1])]">
                 <xsl:value-of select="substring(.,1,4)"/>
                 <xsl:if test="not(position() = last())">,</xsl:if>
               </xsl:for-each>
           </xsl:element>
        </xsl:for-each>
     </docs>
    </xsl:template>
    </xsl:stylesheet>
    

    Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <docs>
      <classesCSV>CATS,DOGS</classesCSV>
    </docs>
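
One caveat with the key above: it is document-wide, so if the input ever contained more than one <classes> element, a prefix that had already appeared in an earlier <classes> would be skipped in the later ones. The sample input has only one <classes>, so the following is just a sketch under the assumption that grouping should be done separately per <classes>. It builds a composite key from the parent's generate-id() and the prefix; the key name kScoped and the '|' separator are illustrative choices, not part of the original answer.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical key name. The key value combines the id of the parent
       <classes> with the four-character prefix, so each <classes> element
       gets its own set of groups. -->
  <xsl:key name="kScoped" match="class1"
           use="concat(generate-id(..), '|', substring(., 1, 4))"/>

  <xsl:template match="/">
    <docs>
      <xsl:for-each select="document-root/classes">
        <classesCSV>
          <!-- Select a class1 only if it is the first node returned for its
               (parent, prefix) key value -->
          <xsl:for-each select="class1[generate-id() =
              generate-id(key('kScoped',
                concat(generate-id(..), '|', substring(., 1, 4)))[1])]">
            <xsl:value-of select="substring(., 1, 4)"/>
            <xsl:if test="position() != last()">,</xsl:if>
          </xsl:for-each>
        </classesCSV>
      </xsl:for-each>
    </docs>
  </xsl:template>
</xsl:stylesheet>
```

For the single-<classes> input in the question, this should behave the same as the simpler key; the extra scoping only matters when the same prefix occurs under several parents.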
    

  • XSLT 2.0:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
     <xsl:template match="/">
      <docs>
       <xsl:element name="classesCSV">
          <!-- One iteration per distinct four-character prefix -->
          <xsl:for-each-group select="//class1" group-by="substring(., 1,4)">
             <xsl:value-of select="current-grouping-key()"/>
             <xsl:if test="not(position() = last())">,</xsl:if>
          </xsl:for-each-group>
       </xsl:element>
      </docs>
     </xsl:template>
    </xsl:stylesheet>
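
If only the distinct prefixes are needed and not the grouped nodes themselves, a shorter XSLT 2.0 variant may be enough. The sketch below is an alternative to xsl:for-each-group, not a restatement of it: it uses the standard XPath 2.0 distinct-values() function together with the separator attribute of xsl:value-of, and it keeps the per-<classes> loop from the question rather than selecting //class1. Note that the XPath specification leaves the order of values returned by distinct-values() implementation-dependent, so the order of the prefixes is not guaranteed, although processors commonly return them in order of first occurrence.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <docs>
      <xsl:for-each select="document-root/classes">
        <classesCSV>
          <!-- Compute the prefix of every class1 child, drop duplicates,
               and join the remaining values with commas -->
          <xsl:value-of select="distinct-values(class1/substring(., 1, 4))"
                        separator=","/>
        </classesCSV>
      </xsl:for-each>
    </docs>
  </xsl:template>
</xsl:stylesheet>
```

When output order must follow document order exactly, xsl:for-each-group as shown above is the safer choice, since it processes groups in order of first appearance.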