Search code examples
xmlxsltxslt-grouping

how to get unique(distinct) element value based on another element value from each record using xslt


I need to get an element value based on another element value from each record.

if RECORD_TYPE='ERROR' then I have to print all the distinct (unique) ERROR_DESC values for all records under ERRORS:ERROR_DESC

Each error description should be printed in a new line.

<root>
<SellOutRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 1:location id is invalid</ERROR_DESC>
        <RECORD_NO>1</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>336549R-001</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>LA</PRODUCT_LINE>
    </SellOutRecord>
    <SalesInRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 1:location id is invalid</ERROR_DESC>
        <RECORD_NO>1</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>336549R-001</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>LA</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>C</PRODUCT_LINE_TYPE>
    </SalesInRecord>
    <SellOutRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 2:location id is invalid</ERROR_DESC>
        <RECORD_NO>2</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>3X-KN73C-DB</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>HA</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
    </SellOutRecord>
    <SellOutRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 3:SO:transaction currency is invalid</ERROR_DESC>
        <RECORD_NO>3</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
    </SellOutRecord>
    <SalesInRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 2:location id is invalid</ERROR_DESC>
        <RECORD_NO>3</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
    </SalesInRecord>
    <SalesInRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 3:SI:transaction document id is invalid</ERROR_DESC>
        <RECORD_NO>3</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
    </SalesInRecord>
    <SalesInRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 5:SI:transaction currency is null or invalid</ERROR_DESC>
        <RECORD_NO>5</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
    </SalesInRecord>
    <InventoryRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 6:product id is invalid</ERROR_DESC>
        <RECORD_NO>6</RECORD_NO>
        <LOCATION_NAME>XYZ XYZ</LOCATION_NAME>
        <PRODUCT_NUMBER>331184-B21</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0S1</PRODUCT_OPTION>
        <PRODUCT_LINE>R8</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
    </InventoryRecord>
    <SellOutRecord>
        <RECORD_TYPE>ERROR</RECORD_TYPE>
        <ERROR_DESC>record 6:SO:invoiced net amount is invalid</ERROR_DESC>
        <RECORD_NO>6</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
</SellOutRecord>
    <SellOutRecord>
        <RECORD_TYPE>VALID</RECORD_TYPE>
        <RECORD_NO>7</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
</SellOutRecord>
    <SellInRecord>
        <RECORD_TYPE>VALID</RECORD_TYPE>
        <RECORD_NO>7</RECORD_NO>
        <LOCATION_NAME>XYZ el</LOCATION_NAME>
        <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
        <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
        <PRODUCT_LINE>J3</PRODUCT_LINE>
        <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
</SellInRecord>
</root>

Expected output is:

<root>
<ERROR>record 1:location id is invalid</ERROR>
<ERROR>record 2:location id is invalid</ERROR>
<ERROR>record 3:SO:transaction currency is invalid</ERROR>
<ERROR>record 3:SI:transaction document id is invalid</ERROR>
<ERROR>record 5:SI:transaction currency is null or invalid</ERROR>
<ERROR>record 6:product id is invalid</ERROR>
<ERROR>record 6:SO:invoiced net amount is invalid</ERROR>
</root>

Thanks in advance.


Solution

  • I. A pure "push-style" XSLT 1.0 solution:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>
    
     <xsl:key name="kErrDescByVal" match="*[RECORD_TYPE='ERROR']/ERROR_DESC"
              use="normalize-space()"/>
    
     <xsl:template match="/*">
         <root>
           <xsl:apply-templates select="node()|@*"/>
         </root>
     </xsl:template>
    
     <xsl:template match="*/*/*" priority="-1"/>
    
     <xsl:template match=
     "ERROR_DESC[generate-id() = generate-id(key('kErrDescByVal',normalize-space())[1])]">
      <ERROR><xsl:apply-templates/></ERROR>
     </xsl:template>
    </xsl:stylesheet>
    

    When this transformation is applied on the provided XML document:

    <root>
        <SellOutRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 1:location id is invalid</ERROR_DESC>
            <RECORD_NO>1</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>336549R-001</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>LA</PRODUCT_LINE>
        </SellOutRecord>
        <SalesInRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 1:location id is invalid</ERROR_DESC>
            <RECORD_NO>1</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>336549R-001</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>LA</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>C</PRODUCT_LINE_TYPE>
        </SalesInRecord>
        <SellOutRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 2:location id is invalid</ERROR_DESC>
            <RECORD_NO>2</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>3X-KN73C-DB</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>HA</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SellOutRecord>
        <SellOutRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 3:SO:transaction currency is invalid</ERROR_DESC>
            <RECORD_NO>3</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SellOutRecord>
        <SalesInRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 2:location id is invalid</ERROR_DESC>
            <RECORD_NO>3</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SalesInRecord>
        <SalesInRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 3:SI:transaction document id is invalid</ERROR_DESC>
            <RECORD_NO>3</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SalesInRecord>
        <SalesInRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 5:SI:transaction currency is null or invalid</ERROR_DESC>
            <RECORD_NO>5</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SalesInRecord>
        <InventoryRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 6:product id is invalid</ERROR_DESC>
            <RECORD_NO>6</RECORD_NO>
            <LOCATION_NAME>XYZ XYZ</LOCATION_NAME>
            <PRODUCT_NUMBER>331184-B21</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0S1</PRODUCT_OPTION>
            <PRODUCT_LINE>R8</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </InventoryRecord>
        <SellOutRecord>
            <RECORD_TYPE>ERROR</RECORD_TYPE>
            <ERROR_DESC>record 6:SO:invoiced net amount is invalid</ERROR_DESC>
            <RECORD_NO>6</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SellOutRecord>
        <SellOutRecord>
            <RECORD_TYPE>VALID</RECORD_TYPE>
            <RECORD_NO>7</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SellOutRecord>
        <SellInRecord>
            <RECORD_TYPE>VALID</RECORD_TYPE>
            <RECORD_NO>7</RECORD_NO>
            <LOCATION_NAME>XYZ el</LOCATION_NAME>
            <PRODUCT_NUMBER>339112-B25</PRODUCT_NUMBER>
            <PRODUCT_OPTION>0D1</PRODUCT_OPTION>
            <PRODUCT_LINE>J3</PRODUCT_LINE>
            <PRODUCT_LINE_TYPE>E</PRODUCT_LINE_TYPE>
        </SellInRecord>
    </root>
    

    the wanted, correct result is produced:

    <root>
       <ERROR>record 1:location id is invalid</ERROR>
       <ERROR>record 2:location id is invalid</ERROR>
       <ERROR>record 3:SO:transaction currency is invalid</ERROR>
       <ERROR>record 3:SI:transaction document id is invalid</ERROR>
       <ERROR>record 5:SI:transaction currency is null or invalid</ERROR>
       <ERROR>record 6:product id is invalid</ERROR>
       <ERROR>record 6:SO:invoiced net amount is invalid</ERROR>
    </root>
    

    Explanation:

    1. Use of the Muenchian grouping method.

    2. Use of normalize-space() to treat as equal two strings that differ only in their white space.


    II. XSLT 2.0 solution:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
    
     <xsl:key name="kErrDescByVal" match="*[RECORD_TYPE='ERROR']/ERROR_DESC"
              use="normalize-space()"/>
    
     <xsl:template match="/*">
         <root>
           <xsl:for-each-group select="*[RECORD_TYPE='ERROR']/ERROR_DESC"
                               group-by="normalize-space()">
           <ERROR><xsl:apply-templates/></ERROR>
         </xsl:for-each-group>
         </root>
     </xsl:template>
    </xsl:stylesheet>
    

    When this transformation is applied on the same XML document (above), the same correct result is produced:

    <root>
       <ERROR>record 1:location id is invalid</ERROR>
       <ERROR>record 2:location id is invalid</ERROR>
       <ERROR>record 3:SO:transaction currency is invalid</ERROR>
       <ERROR>record 3:SI:transaction document id is invalid</ERROR>
       <ERROR>record 5:SI:transaction currency is null or invalid</ERROR>
       <ERROR>record 6:product id is invalid</ERROR>
       <ERROR>record 6:SO:invoiced net amount is invalid</ERROR>
    </root>
    

    Explanation:

    Use of the XSLT 2.0 instruction <xsl:for-each-group> with the group-by attribute.