
XSLT to remove duplicate child nodes


I have a source XML in which I need to remove duplicate records, where a duplicate is identified by two child nodes (MemberAltId and PatientSeqId). I am new to XSLT and have tried various approaches in XSLT 1.0 and 2.0 without success.

Source XML:

<?xml version="1.0" encoding="UTF-8"?>
<DataSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <EDI.Table name="EDI.Table">
        <Row>
            <GroupId>A</GroupId>
            <MemberAltId>ABC123</MemberAltId>
            <PatientSeqId>002</PatientSeqId>
            <TotalMemberDeductible>499.52</TotalMemberDeductible>
        </Row>
        <Row>
            <GroupId>A</GroupId>
            <MemberAltId>ABC123</MemberAltId>
            <PatientSeqId>002</PatientSeqId>
            <TotalMemberOop>499.52</TotalMemberOop>
        </Row>
        <Row>
            <GroupId>A</GroupId>
            <MemberAltId>DEF123</MemberAltId>
            <PatientSeqId>001</PatientSeqId>
            <TotalMemberOop>50.00</TotalMemberOop>
        </Row>
    </EDI.Table>
</DataSet>

I have looked at other XSLT examples that remove duplicates, but I can't get my code to work. I used the following XSLT to remove the duplicate nodes, but the output is identical to the source XML.

The XSLT 1.0 Code:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="kMemberId" match="//Row" use="concat(MemberAltId,'|',PatientSeqId)"/>

<xsl:template match="/EDI.Table">
    <root>
        <xsl:for-each select="//Row[generate-id() = generate-id(key('kMemberId',concat(MemberAltId,'|',PatientSeqId))[1])]">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:for-each>
    </root>
</xsl:template>

<xsl:template match="@*|node()">
    <xsl:copy>
               <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

The XSLT 2.0 Code:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="yes" />

<xsl:template match="/EDI.Table">
    <xsl:copy>
        <xsl:copy-of select="@* | Row"/>
        <xsl:for-each-group select="Row" group-by="MemberAltId">
            <xsl:for-each-group select="current-group()" group-by="PatientSeqId">
               <xsl:copy-of select="current-group()[1]"/>
          </xsl:for-each-group>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Expected Output XML:

<?xml version="1.0" encoding="UTF-8"?>
<DataSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <EDI.Table name="EDI.Table">
        <Row>
            <GroupId>A</GroupId>
            <MemberAltId>ABC123</MemberAltId>
            <PatientSeqId>002</PatientSeqId>
            <TotalMemberDeductible>499.52</TotalMemberDeductible>
            <TotalMemberOop>499.52</TotalMemberOop>
        </Row>
        <Row>
            <GroupId>A</GroupId>
            <MemberAltId>DEF123</MemberAltId>
            <PatientSeqId>001</PatientSeqId>
            <TotalMemberOop>50.00</TotalMemberOop>
        </Row>
    </EDI.Table>
</DataSet>

If someone can help me achieve the above expected XML that would be greatly appreciated.


Solution

  • In XSLT 2.0 you can do:

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
    <xsl:output method="xml" indent="yes" />
    
    <xsl:template match="EDI.Table">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
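            <!-- One group per distinct MemberAltId/PatientSeqId pair -->
            <xsl:for-each-group select="Row" group-by="concat(MemberAltId, '|', PatientSeqId)">
                <Row>
                    <!-- Shared fields, taken from the first Row of the group -->
                    <xsl:copy-of select="GroupId | MemberAltId | PatientSeqId"/>
                    <!-- Totals collected from every Row in the group -->
                    <xsl:copy-of select="current-group()/(TotalMemberDeductible | TotalMemberOop)"/>
                </Row>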
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
    
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    
    </xsl:stylesheet>
    

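    The main problems with your attempt are:

    <xsl:template match="/EDI.Table">
    

    This matches EDI.Table only when it is the root element. In your input the root element is DataSet, so the template is never applied and the identity template copies the input unchanged. And:

    <xsl:copy-of select="@* | Row"/>
    

    This copies all the Row elements before outputting the unique ones, so the duplicates would still appear in the result. Also, copying only current-group()[1] keeps the first Row of each group as it is, which drops the TotalMemberOop value that your expected output merges into it.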
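
Since the question also asks about XSLT 1.0, here is a minimal sketch of the same merge using Muenchian grouping, which is the usual substitute for xsl:for-each-group in 1.0. The key name kRow is made up for this example, and the sketch assumes the input has a single EDI.Table element, because an xsl:key indexes Row elements across the whole document rather than per table.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<!-- Index every Row by its composite MemberAltId|PatientSeqId value
     (kRow is a name chosen for this example) -->
<xsl:key name="kRow" match="Row" use="concat(MemberAltId, '|', PatientSeqId)"/>

<!-- Identity template: copies everything not handled below -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- Relative pattern, so it matches EDI.Table wherever it occurs -->
<xsl:template match="EDI.Table">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <!-- Visit only the first Row for each key value -->
        <xsl:for-each select="Row[generate-id() = generate-id(key('kRow', concat(MemberAltId, '|', PatientSeqId))[1])]">
            <Row>
                <!-- Shared fields, taken from that first Row -->
                <xsl:copy-of select="GroupId | MemberAltId | PatientSeqId"/>
                <!-- Totals collected from every Row with the same key -->
                <xsl:copy-of select="key('kRow', concat(MemberAltId, '|', PatientSeqId))/*[self::TotalMemberDeductible or self::TotalMemberOop]"/>
            </Row>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

For the sample input this should give the two merged Row elements shown in the expected output, although the exact indentation depends on the processor's serializer. If the real data can contain several EDI.Table elements, the key value would also need something that identifies the parent table, for example generate-id(..) added to the concat() in both the key definition and the lookups.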