XSLT: for-each-group to turn a set of XML nodes into concurrent variables


I'm struggling with the following transformation.

My source is an XML document whose root node I don't know (it's created on the fly by an ETL tool and I can't output the XML, hence the presumed_root).

XML Source

<presumed_root>
    <IndexGroup>
        <IndexDate>01/01/2017</IndexDate>
        <IndexRate>US_CPI</IndexRate>
        <IndexValue>190</IndexValue>
    </IndexGroup>
    <IndexGroup>
        <IndexDate>01/04/2017</IndexDate>
        <IndexRate>US_CPI</IndexRate>
        <IndexValue>195</IndexValue>
    </IndexGroup>
    <IndexGroup>
        <IndexDate>01/07/2017</IndexDate>
        <IndexRate>US_CPI</IndexRate>
        <IndexValue>193</IndexValue>
    </IndexGroup>
    [...]
</presumed_root>

Desired XML output

<root>
<IndexGroup>
    <IndexRate>US_CPI</IndexRate>
    <IndexNumbers>
        <IndexCode>US_CPI@2017</IndexCode>
        <IndexYear>2017</IndexYear>
        <Month01>190</Month01>
        <Month02/>
        <Month03/>
        <Month04>195</Month04>
        <Month05/>
        <Month06/>
        <Month07>193</Month07>
        [...]
    </IndexNumbers>
</IndexGroup>
</root>

So, I'm trying to group values by IndexRate & IndexYear, and put every month in a distinct node.

Failing XSL

 <xsl:template match="/">

<xsl:for-each-group select="/IndexGroup" group-by="concat(translate(normalize-space(IndexRate),' ','_'),'@',xs:string(year-from-date(xs:date(IndexDate))))"> 
<IndexGroup>

    <xsl:variable name="IndexCode" select="current-group()/translate(normalize-space(IndexRate),' ','_')"/>
    <IndexRate><xsl:value-of select="$IndexCode"/></IndexRate>
    <IndexNumbers>
            <xsl:variable name="IndexCode"><xsl:value-of select="current-grouping-key()"/></xsl:variable><!-- Code = Index @ Year -->
            <xsl:variable name="Month01"><xsl:value-of select="translate(replace(current-group()[month-from-date(xs:date(current-group()/IndexDate))=1]/IndexValue, '\p{Z}+', ''),',','.')"/></xsl:variable>
            <xsl:variable name="Month02"><xsl:value-of select="translate(replace(current-group()[month-from-date(xs:date(current-group()/IndexDate))=2]/IndexValue, '\p{Z}+', ''),',','.')"/></xsl:variable>
        [...]           
    </IndexNumbers>
[...]
</IndexGroup>

With this XML structure & XSL, for-each-group doesn't group anything at all. Thus, I can't manage to get all months of a year/index combination to be filled at the same time.

Any help would be appreciated, don't hesitate to ask for further explanation / context / input / examples.

Regards,


Solution

  • I see a few issues with your XSLT...

    • You are selecting /IndexGroup in your for-each-group, which only matches when IndexGroup is the root element of your XML. In your input it is a child of the root, so nothing is selected and nothing is grouped.
    • You try to cast IndexDate to xs:date, but DD/MM/YYYY is not a valid xs:date (the lexical form is YYYY-MM-DD).
    • You're creating xsl:variable elements instead of literal result elements for the children of IndexNumbers, so they produce no output.
    • Since you're grouping by IndexRate and then by IndexRate + IndexYear, I think you need two nested for-each-group instructions.
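
    If you would rather keep year-from-date() and month-from-date(), the date has to be rewritten into the YYYY-MM-DD form first. The following is only a minimal sketch, assuming every IndexDate is strictly DD/MM/YYYY; the dates, date and isoDate names are made up for illustration and are not part of your data.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs">
      <xsl:output indent="yes"/>

      <!-- Match whatever the root element is called -->
      <xsl:template match="/*">
        <dates>
          <xsl:for-each select="IndexGroup">
            <!-- Assumption: IndexDate is DD/MM/YYYY. Reorder it into
                 YYYY-MM-DD, the only lexical form xs:date accepts. -->
            <xsl:variable name="isoDate" as="xs:date"
              select="xs:date(replace(normalize-space(IndexDate),
                      '^(\d{2})/(\d{2})/(\d{4})$', '$3-$2-$1'))"/>
            <!-- Hypothetical output element, only there to show the result -->
            <date year="{year-from-date($isoDate)}"
                  month="{month-from-date($isoDate)}"/>
          </xsl:for-each>
        </dates>
      </xsl:template>
    </xsl:stylesheet>
    ```

    With the three sample IndexGroup elements this should report year 2017 with months 1, 4 and 7. A value that does not match the pattern is passed to xs:date() unchanged and raises a conversion error, which at least makes bad input visible. Note also that inside a predicate the date has to be read relative to the item being tested (IndexDate), not through current-group()/IndexDate, which hands xs:date() a whole sequence.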

    Here's what I would do...

    XML Input

    <presumed_root>
        <IndexGroup>
            <IndexDate>01/01/2017</IndexDate>
            <IndexRate>US_CPI</IndexRate>
            <IndexValue>190</IndexValue>
        </IndexGroup>
        <IndexGroup>
            <IndexDate>01/04/2017</IndexDate>
            <IndexRate>US_CPI</IndexRate>
            <IndexValue>195</IndexValue>
        </IndexGroup>
        <IndexGroup>
            <IndexDate>01/07/2017</IndexDate>
            <IndexRate>US_CPI</IndexRate>
            <IndexValue>193</IndexValue>
        </IndexGroup>
        [...]
    </presumed_root>
    

    XSLT 2.0

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output indent="yes"/>
      <xsl:strip-space elements="*"/>
    
      <xsl:template match="/*">
        <root>
          <xsl:for-each-group select="IndexGroup" group-by="IndexRate">
            <xsl:copy>
              <xsl:copy-of select="IndexRate"/>
              <xsl:for-each-group select="current-group()" 
                group-by="concat(IndexRate,'@',tokenize(IndexDate,'/')[last()])">
                <IndexNumbers>
                  <xsl:apply-templates select="@*"/>
                  <IndexCode>
                    <xsl:value-of select="current-grouping-key()"/>
                  </IndexCode>
                  <IndexYear>
                    <xsl:value-of select="tokenize(IndexDate,'/')[last()]"/>
                  </IndexYear>
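                  <!-- One element per month; it stays empty when the group has no value for that month -->
                  <xsl:for-each select="1 to 12">
                    <xsl:variable name="month" select="format-number(.,'00')"/>
                    <!-- Matches DD/MM/YY... where MM is the current month -->
                    <xsl:variable name="pattern" select="concat('\d{2}/',$month,'/\d{2}')"/>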
                    <xsl:element name="month{$month}">
                      <xsl:value-of 
                        select="current-group()[matches(IndexDate,$pattern)]/IndexValue"/>
                    </xsl:element>
                  </xsl:for-each>
                </IndexNumbers>
              </xsl:for-each-group>          
            </xsl:copy>
          </xsl:for-each-group>
        </root>
      </xsl:template>
    
    </xsl:stylesheet>
    

    Output

    <root>
       <IndexGroup>
          <IndexRate>US_CPI</IndexRate>
          <IndexNumbers>
             <IndexCode>US_CPI@2017</IndexCode>
             <IndexYear>2017</IndexYear>
             <month01>190</month01>
             <month02/>
             <month03/>
             <month04>195</month04>
             <month05/>
             <month06/>
             <month07>193</month07>
             <month08/>
             <month09/>
             <month10/>
             <month11/>
             <month12/>
          </IndexNumbers>
       </IndexGroup>
    </root>
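
    The output above uses lowercase month01 element names, while the desired output in the question uses Month01. The variant below is a sketch of the same nested grouping that emits the capitalised names and compares the month token directly instead of using a regex. It assumes the dates are DD/MM/YYYY and that IndexRate needs nothing more than whitespace normalisation.

    ```xml
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="/*">
        <root>
          <!-- First level: one IndexGroup per rate -->
          <xsl:for-each-group select="IndexGroup" group-by="normalize-space(IndexRate)">
            <IndexGroup>
              <IndexRate><xsl:value-of select="current-grouping-key()"/></IndexRate>
              <!-- Second level: one IndexNumbers per year (third token of DD/MM/YYYY) -->
              <xsl:for-each-group select="current-group()"
                  group-by="tokenize(normalize-space(IndexDate), '/')[3]">
                <xsl:variable name="year" select="current-grouping-key()"/>
                <IndexNumbers>
                  <IndexCode>
                    <xsl:value-of select="concat(normalize-space(IndexRate), '@', $year)"/>
                  </IndexCode>
                  <IndexYear><xsl:value-of select="$year"/></IndexYear>
                  <xsl:for-each select="1 to 12">
                    <xsl:variable name="month" select="format-number(., '00')"/>
                    <!-- Month01 .. Month12; empty when no entry exists for that month -->
                    <xsl:element name="Month{$month}">
                      <xsl:value-of
                        select="current-group()[tokenize(normalize-space(IndexDate), '/')[2] = $month]/IndexValue"/>
                    </xsl:element>
                  </xsl:for-each>
                </IndexNumbers>
              </xsl:for-each-group>
            </IndexGroup>
          </xsl:for-each-group>
        </root>
      </xsl:template>
    </xsl:stylesheet>
    ```

    If a rate has several values in the same month of the same year, xsl:value-of joins them with a space, so you may want to pick one explicitly (for example with [1] or [last()]) if that can happen in your data.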