XSLT 1.0 remove alternate duplicate records


I am trying to remove alternate duplicate records from XML using XSLT 1.0. Below is the XML I am working with.

<FileRead xmlns="http://TargetNamespace.com/EmpDetails">
   <EmployeeInformation>
      <Empl_ID>63496</Empl_ID>
      <Record_Updated_Date>7/19/2017</Record_Updated_Date>
   </EmployeeInformation>
   <EmployeeInformation>
      <Empl_ID>63496</Empl_ID>
      <Record_Updated_Date>8/19/2017</Record_Updated_Date>
   </EmployeeInformation>      
   <EmployeeInformation>
      <Empl_ID>63497</Empl_ID>
      <Record_Updated_Date>8/19/2017</Record_Updated_Date>
   </EmployeeInformation>
   <EmployeeInformation>
      <Empl_ID>63496</Empl_ID>
      <Record_Updated_Date>9/19/2017</Record_Updated_Date>
   </EmployeeInformation>
   <EmployeeInformation>
      <Empl_ID>63496</Empl_ID>
      <Record_Updated_Date>10/19/2017</Record_Updated_Date>
   </EmployeeInformation>      
</FileRead>

The expected result is:

<FileRead xmlns="http://TargetNamespace.com/EmpDetails">
   <EmployeeInformation>
      <Empl_ID>63496</Empl_ID>
      <Record_Updated_Date>8/19/2017</Record_Updated_Date>
   </EmployeeInformation>      
   <EmployeeInformation>
      <Empl_ID>63497</Empl_ID>
      <Record_Updated_Date>8/19/2017</Record_Updated_Date>
   </EmployeeInformation>
   <EmployeeInformation>
      <Empl_ID>63496</Empl_ID>
      <Record_Updated_Date>10/19/2017</Record_Updated_Date>
   </EmployeeInformation>      
</FileRead>

The XSLT I have retains only the last of all the duplicate records, but I want to remove only the alternate duplicates. Here I have 4 records for the same Empl_ID (63496), and I want to retain #2 and #4 of them.

<xsl:stylesheet version="1.0" xmlns:ns0="http://TargetNamespace.com/EmpDetails" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="employees" match="ns0:EmployeeInformation" use="ns0:Empl_ID"/>
  <xsl:template match="/*">
    <ns0:FileRead>
      <xsl:copy-of select="*[generate-id() = generate-id(key('employees', ns0:Empl_ID)[last()])]"/>
    </ns0:FileRead>
  </xsl:template>
</xsl:stylesheet>

Solution

  • Do you need grouping here? You could just do this...

    <xsl:stylesheet version="1.0" xmlns:ns0="http://TargetNamespace.com/EmpDetails" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/*">
        <xsl:copy>
          <xsl:copy-of select="*[not(ns0:Empl_ID = following-sibling::*[1]/ns0:Empl_ID)]" />
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
    

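    i.e., select all elements whose Empl_ID differs from that of the immediately following sibling. The last element has no following sibling, so the comparison is false and it is always kept.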
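
    Applied to the sample input, that predicate drops records 1 and 4 (each is followed by another 63496) and keeps records 2, 3 and 5, which matches the expected result. If other content under the root also needs to pass through untouched, the same test could be written as an identity transform with an empty template for the records to drop. This is only a sketch of that alternative, not part of the answer above; the `xsl:strip-space` and `xsl:output` settings are assumptions added to keep the output tidy.

```xml
<xsl:stylesheet version="1.0" xmlns:ns0="http://TargetNamespace.com/EmpDetails" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: drop whitespace-only text nodes and re-indent the output -->
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: suppress a record whose Empl_ID equals that of the
       immediately following sibling, so only the last record of each
       consecutive run survives -->
  <xsl:template match="ns0:EmployeeInformation[ns0:Empl_ID = following-sibling::*[1]/ns0:Empl_ID]"/>
</xsl:stylesheet>
```

    Note that both versions compare only adjacent records, so they depend on the duplicates appearing next to each other in document order, as they do in the sample.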