XSLT remove duplicate children


I am looking for an XSLT transformation that de-duplicates the child elements of a parent. In my case both the parent and the child element names are given (i.e. I don't want to de-duplicate arbitrary children of arbitrary elements).

For example, say I want to de-duplicate the <ID> children of each <ROW> in a <ROWSET>.

Input:

<ROWSET>
    <ROW>
         <ID> 1 </ID>
         ...
         <ID> 1 </ID>
         ...
    </ROW>
    <ROW>
         <ID> 2 </ID>
         ...
         <ID> 2 </ID>
         ...
    </ROW>
    ...
</ROWSET>

I want the output to be:

<ROWSET>
    <ROW>
         <ID> 1 </ID>
         ...
    </ROW>
    <ROW>
         <ID> 2 </ID>
         ...
    </ROW>
    ...
</ROWSET>

Here '...' indicates the presence of any number of any other tags.

Edit: there may be anything between the two duplicate children.


Solution

  • A simple, straightforward approach: skip any ID whose content matches that of a preceding sibling ID under the same parent.

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
    
        <xsl:output indent="yes"/>
        <xsl:strip-space elements="*"/>
    
        <!-- Identity template: copy every attribute and node unchanged. -->
        <xsl:template match="@* | node()">
            <xsl:copy>
                <xsl:apply-templates select="@* | node()"/>
            </xsl:copy>
        </xsl:template>
    
        <!-- Copy an ID only if no preceding sibling ID has the same text. -->
        <xsl:template match="ID">
            <xsl:if test="not(preceding-sibling::ID/text() = current()/text())">
                <xsl:copy>
                    <xsl:apply-templates select="@* | node()"/>
                </xsl:copy>
            </xsl:if>
        </xsl:template>
    
    </xsl:stylesheet>
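
The same idea can also be written with a key lookup instead of a scan over preceding siblings, which may help on rows with many children. This is only a sketch and not part of the original answer. It assumes the duplicates to drop are ID children of ROW, as in the sample input, and the key name id-by-row-and-value is made up for illustration. It compares the string value of each ID rather than its text() nodes, which gives the same result when an ID contains only text.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Hypothetical key: index each ROW/ID by its parent's identity plus its own value,
         so equal IDs in different ROWs land in different groups. -->
    <xsl:key name="id-by-row-and-value" match="ROW/ID"
             use="concat(generate-id(..), '|', .)"/>

    <!-- Identity template: copy every attribute and node unchanged. -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Empty template: suppress any ROW/ID that is not the first node in its key group. -->
    <xsl:template match="ROW/ID[generate-id() != generate-id(key('id-by-row-and-value', concat(generate-id(..), '|', .))[1])]"/>

</xsl:stylesheet>
```

Because the key includes generate-id(..), de-duplication stays scoped to a single ROW. The first ID with a given value in each ROW falls through to the identity template and is copied as usual.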