How to remove duplicates without changing the position of the segment


I have a requirement to remove duplicate ITEM segments without altering the position of the ones that remain. ITEM elements count as duplicates when they have the same N11 and AR values, regardless of any other child nodes; in that case only one occurrence should be kept. If one of the duplicates has a subnode child, that ITEM is the one to keep and the others should be removed. The XSLT I am currently using only treats exact ITEM matches as duplicates, so it does not remove them, and it moves the ITEM segment to the end.

Input sample

<?xml version="1.0" encoding="UTF-8"?>
<D02X001>
    <DOC BEGIN="1">
        <DC40 SEGMENT="1">
            <NAM>DC40</NAM>
        </DC40>
        <BXYH SEGMENT="1">
            <LDAT>date</LDAT>
            <UDAT>date1</UDAT>
            <BXYI SEGMENT="1">
                <TNR>123453</TNR>
                <ORT>1000</ORT>
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                </ITEM>
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <subnode>
                        <field1> 13</field1>
                    </subnode>
                </ITEM>
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                </ITEM>
                <ITEMNEW SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <PQC>NU</PQC>
                    <QTY>3456</QTY>
                    <NUM/>
                    <ASCD/>
                </ITEMNEW>
            </BXYI>
            <BXYI SEGMENT="1">
                <TNR>123453</TNR>
                <ORT>1000</ORT>
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                </ITEM>
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <subnode>
                        <field1> 13</field1>
                    </subnode>
                </ITEM>
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                </ITEM>
                <ITEMNEW SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <PQC>NU</PQC>
                    <QTY>3456</QTY>
                    <NUM/>
                    <ASCD/>
                </ITEMNEW>
            </BXYI>
        </BXYH>
    </DOC>
</D02X001>

Output sample

<?xml version="1.0" encoding="UTF-8"?>
<D02X001>
    <DOC BEGIN="1">
        <DC40 SEGMENT="1">
            <NAM>DC40</NAM>
        </DC40>
        <BXYH SEGMENT="1">
            <LDAT>date</LDAT>
            <UDAT>date1</UDAT>
            <BXYI SEGMENT="1">
                <TNR>123453</TNR>
                <ORT>1000</ORT>         
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <subnode>
                        <field1> 13</field1>
                    </subnode>
                </ITEM>     
                <ITEMNEW SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <PQC>NU</PQC>
                    <QTY>3456</QTY>
                    <NUM/>
                    <ASCD/>
                </ITEMNEW>
            </BXYI>
            <BXYI SEGMENT="1">
                <TNR>123453</TNR>
                <ORT>1000</ORT>     
                <ITEM SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <subnode>
                        <field1> 13</field1>
                    </subnode>
                </ITEM>     
                <ITEMNEW SEGMENT="1">
                    <N11>6789</N11>
                    <AR>03</AR>
                    <PQC>NU</PQC>
                    <QTY>3456</QTY>
                    <NUM/>
                    <ASCD/>
                </ITEMNEW>
            </BXYI>
        </BXYH>
    </DOC>
</D02X001>

XSLT I used

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">
  
   <xsl:key name="item-by-N11-AR" match="BXYI/ITEM" composite="yes" use="N11, AR"/>

  <xsl:template match="BXYI/ITEM[not(key('item-by-N11-AR', (N11, AR))[subnode])][not(. is key('item-by-N11-AR', (N11, AR))[1])]"/>
  <xsl:template match="BXYI/ITEM[key('item-by-N11-AR', (N11, AR))[subnode]][not(. is key('item-by-N11-AR', (N11, AR))[subnode][1])]"/>

  <xsl:mode on-no-match="shallow-copy"/>



</xsl:stylesheet>


Solution

  • You can use a composite key on N11 and AR, then add empty templates that match the duplicates you don't want to output, with an additional check for subnode, e.g.

      <xsl:key name="item-by-N11-AR" match="ITEM" composite="yes" use="N11, AR"/>
    
      <xsl:template match="ITEM[not(key('item-by-N11-AR', (N11, AR))[subnode])][not(. is key('item-by-N11-AR', (N11, AR))[1])]"/>
      <xsl:template match="ITEM[key('item-by-N11-AR', (N11, AR))[subnode]][not(. is key('item-by-N11-AR', (N11, AR))[subnode][1])]"/>
    
      
      <xsl:mode on-no-match="shallow-copy"/>
    

    As written, the key matches ITEM elements at any level, so duplicates are eliminated across the whole document rather than per parent element.

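    To restrict the elimination to ITEM children of BXYI and to handle each of several BXYI elements separately, match on BXYI/ITEM and pass the parent (..) as the third argument of key(), so that the lookup only considers ITEM elements inside the same BXYI, e.g.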

      <xsl:key name="item-by-N11-AR" match="BXYI/ITEM" composite="yes" use="N11, AR"/>
    
      <xsl:template match="BXYI/ITEM[not(key('item-by-N11-AR', (N11, AR), ..)[subnode])][not(. is key('item-by-N11-AR', (N11, AR), ..)[1])]"/>
      <xsl:template match="BXYI/ITEM[key('item-by-N11-AR', (N11, AR), ..)[subnode]][not(. is key('item-by-N11-AR', (N11, AR), ..)[subnode][1])]"/>
    
    
      <xsl:mode on-no-match="shallow-copy"/>
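
    For reference, the fragments can be assembled into a complete stylesheet along the following lines. This is a sketch rather than a tested drop-in: it assumes an XSLT 3.0 processor (composite keys and xsl:mode require 3.0), and it scopes every key() call to the parent BXYI with the third argument, including the subnode check, so that a subnode in one BXYI does not affect the ITEM elements of another.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  exclude-result-prefixes="#all">

  <!-- Copy every node unchanged unless a template below matches it -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Index the ITEM children of BXYI by the value pair (N11, AR) -->
  <xsl:key name="item-by-N11-AR" match="BXYI/ITEM" composite="yes" use="N11, AR"/>

  <!-- Case 1: no ITEM with this key in the same BXYI has a subnode.
       Suppress every ITEM except the first one with that key. -->
  <xsl:template match="BXYI/ITEM[not(key('item-by-N11-AR', (N11, AR), ..)[subnode])][not(. is key('item-by-N11-AR', (N11, AR), ..)[1])]"/>

  <!-- Case 2: at least one ITEM with this key in the same BXYI has a subnode.
       Suppress every ITEM except the first one that has a subnode. -->
  <xsl:template match="BXYI/ITEM[key('item-by-N11-AR', (N11, AR), ..)[subnode]][not(. is key('item-by-N11-AR', (N11, AR), ..)[subnode][1])]"/>

</xsl:stylesheet>
```

    The first predicates of the two templates are mutually exclusive, so a given ITEM is matched by at most one of them. Everything else, including ITEMNEW, falls through to the shallow copy and keeps its original position.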