Tags: xml, xslt, xslt-2.0, xslt-grouping

XSLT Grouping tokenized values by condition


I'm having difficulty grouping repeating elements into a new XML structure.

My input file is:

<Load>
    <DataArea>
        <tag>
            <row>A,Header</row>
        </tag>

        <tag>
            <row>B,20190701</row>
        </tag>
        <tag>
            <row>C,12345, 100.00, 200.00</row>
        </tag>
        <tag>
            <row>D,001, 25.00</row>
        </tag>
        <tag>
            <row>D,002, 35.00</row>
        </tag>
        <tag>
            <row>D,003, 45.00</row>
        </tag>

        <tag>
            <row>B,20190702</row>
        </tag>
        <tag>
            <row>C,12345, 300.00, 400.00</row>
        </tag>
        <tag>
            <row>D,004, 55.00</row>
        </tag>
        <tag>
            <row>D,005, 65.00</row>
        </tag>
        <tag>
            <row>D,006, 75.00</row>
        </tag>

    </DataArea>
</Load>

I have to tokenize the comma-separated element values and transform them into the XML structure below.

<Load>
    <DataArea>
        <Header>
            <A>Header</A>
        </Header>

        <Line>
            <!-- July 1 Record -->
            <B>20190701</B>
            <C>12345</C>
            <D>
                <code>001</code>
                <amount>25.00</amount>
            </D>
            <D>
                <code>002</code>
                <amount>35.00</amount>
            </D>
            <D>
                <code>003</code>
                <amount>45.00</amount>
            </D>
        </Line>

        <Line>
            <!-- July 2 Record -->
            <B>20190702</B>
            <C>12345</C>
            <D>
                <code>004</code>
                <amount>55.00</amount>
            </D>
            <D>
                <code>005</code>
                <amount>65.00</amount>
            </D>
            <D>
                <code>006</code>
                <amount>75.00</amount>
            </D>
        </Line>
    </DataArea>
</Load>

There are two <Line> elements because there are only two dates, July 1 and July 2. Each date is the value of a <B> element.

Every <Line> contains multiple <D> elements.

My problem is that I can't group <B>, <C> and the <D>'s together inside a <Line>. Each <D> should belong to the right <B> (date).

Below is my XSLT code.

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" />
    <xsl:strip-space elements="*" />
    <xsl:template match="/">
        <Load>
            <DataArea>
                <Header>
                    <A>
                        <xsl:for-each select="Load/DataArea/tag">
                            <xsl:variable name="col"
                                select="tokenize(current(),',')" />
                            <xsl:if test="$col[1] = 'A' ">
                                <xsl:value-of select="$col[2]" />
                            </xsl:if>
                        </xsl:for-each>

                    </A>
                </Header>

                <xsl:for-each select="Load/DataArea/tag">
                    <xsl:variable name="column"
                        select="tokenize(current(),',')" />
                    <Line>
                        <xsl:if test="$column[1] = 'B' ">
                            <B>
                                <xsl:value-of select="$column[2]" />
                            </B>
                        </xsl:if>
                        <xsl:if test="$column[1] = 'C' ">
                            <C>
                                <xsl:value-of select="$column[2]" />
                            </C>
                        </xsl:if>
                        <xsl:for-each select="../tag">
                            <xsl:variable name="column"
                                select="tokenize(current(),',')" />
                            <xsl:if test="$column[1] = 'D' ">
                                <D>
                                    <code>
                                        <xsl:value-of select="$column[2]" />
                                    </code>
                                    <amount>
                                        <xsl:value-of select="$column[3]" />
                                    </amount>
                                </D>
                            </xsl:if>
                        </xsl:for-each>
                    </Line>
                </xsl:for-each>
            </DataArea>
        </Load>
    </xsl:template>
</xsl:stylesheet>

I am getting all the <D> elements on every iteration instead of just those that belong to the current date (<B>), and <Line> is repeated for every <tag>.

I'm not sure how to solve this, especially since I need to tokenize the values.

I'd appreciate any help.

Thank you.


Solution

  • I think this may be a job for xsl:for-each-group with the group-starting-with attribute.

    Try this XSLT. I have also simplified how the A header is extracted, using starts-with and substring-after instead of a loop.

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" indent="yes" />
      <xsl:strip-space elements="*" />
    
      <xsl:template match="/">
        <Load>
          <DataArea>
            <Header>
              <A>
                <xsl:value-of select="substring-after(Load/DataArea/tag[starts-with(row, 'A,')], ',') " />
              </A>
            </Header>          
            <xsl:for-each-group select="Load/DataArea/tag[not(starts-with(row, 'A,'))]" group-starting-with="tag[starts-with(row, 'B,')]">
              <Line>
                <xsl:for-each select="current-group()">
                  <!-- Split on a comma plus any following spaces, so " 25.00" becomes "25.00" -->
                  <xsl:variable name="column" select="tokenize(row, ',\s*')" />
                  <!-- The first token (B, C or D) becomes the element name -->
                  <xsl:element name="{$column[1]}">
                    <xsl:choose>
                      <xsl:when test="$column[1] = 'B' or $column[1] = 'C'">
                        <xsl:value-of select="$column[2]" />
                      </xsl:when>
                      <xsl:otherwise>
                        <code>
                          <xsl:value-of select="$column[2]" />
                        </code>
                        <amount>
                          <xsl:value-of select="$column[3]" />
                        </amount>
                      </xsl:otherwise>
                    </xsl:choose>
                  </xsl:element>
                </xsl:for-each>
              </Line>
            </xsl:for-each-group>
          </DataArea>
        </Load>
      </xsl:template>
    </xsl:stylesheet>
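
    To see how group-starting-with splits the rows, here is a minimal sketch (separate from the stylesheet above) that only prints each group's rows as text. It assumes the same input document as in the question; the "Group N:" label and the " | " separator are made up for illustration.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text" />
      <xsl:strip-space elements="*" />

      <xsl:template match="/">
        <!-- Skip the A header row; every tag whose row starts with 'B,' opens a new group -->
        <xsl:for-each-group select="Load/DataArea/tag[not(starts-with(row, 'A,'))]"
            group-starting-with="tag[starts-with(row, 'B,')]">
          <!-- Inside for-each-group, position() is the number of the current group -->
          <xsl:value-of select="concat('Group ', position(), ': ')" />
          <!-- current-group() holds the B tag plus every tag up to the next B tag -->
          <xsl:value-of select="current-group()/row" separator=" | " />
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each-group>
      </xsl:template>
    </xsl:stylesheet>
    ```

    Run against the input from the question, this should print one line per B row, each listing that B row followed by the C and D rows up to the next B row. That is the same partition the <Line> elements are built from.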