Get max date from each specific user ID without repeating IDs using XSLT


I have the following XML:

<?xml version='1.0' encoding='UTF-8'?>
<root>
    <Data>
        <Record>
            <User>1</User>
            <LastModified>1/1/2023</LastModified>
            <UniversityDegree>University of Texas Bachelors</UniversityDegree>
        </Record>
        <Record>
            <User>1</User>
            <LastModified>1/11/2023</LastModified>
            <UniversityDegree>University of Missouri Masters</UniversityDegree>
        </Record>
        <Record>
            <User>2</User>
            <LastModified>1/1/2024</LastModified>
            <UniversityDegree>University of Texas Bachelors</UniversityDegree>
        </Record>
        <Record>
            <User>2</User>
            <LastModified>1/12/2023</LastModified>
            <UniversityDegree>University of Missouri Masters</UniversityDegree>
        </Record>
        <Record>
            <User>3</User>
            <LastModified>5/7/2023</LastModified>
            <UniversityDegree>University of Texas Bachelors</UniversityDegree>
        </Record>
        <Record>
            <User>3</User>
            <LastModified>9/8/2023</LastModified>
            <UniversityDegree>University of Missouri Masters</UniversityDegree>
        </Record>
        <Record>
            <User>4</User>
            <LastModified>24/1/2023</LastModified>
            <UniversityDegree>University of Texas Bachelors</UniversityDegree>
        </Record>
        <Record>
            <User>4</User>
            <LastModified>28/9/2023</LastModified>
            <UniversityDegree>University of Missouri Masters</UniversityDegree>
        </Record>
        <Record>
            <User>5</User>
            <LastModified>15/3/2023</LastModified>
            <UniversityDegree>University of Texas Bachelors</UniversityDegree>
        </Record>
        <Record>
            <User>5</User>
            <LastModified>10/3/2023</LastModified>
            <UniversityDegree>University of Missouri Masters</UniversityDegree>
        </Record>
    </Data>
</root>

I need to extract the max date for each user. For example, for user 5 the max of 15/3/2023 and 10/3/2023 is 15/3/2023, and it should be shown like this:

<?xml version="1.0" encoding="UTF-8"?>
<LastModified>15/3/2023</LastModified>
<User>5</User>

This is what I have tried so far:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>
   <xsl:template match="//root">
      <xsl:for-each select="//Record">
         <xsl:sort select="number(substring(LastModified, 7, 4))" order="descending"/>
         <xsl:sort select="number(substring(LastModified, 3, 2))" order="descending"/>
         <xsl:sort select="number(substring(LastModified, 1, 2))" order="descending"/>
         
         
         <xsl:if test="position() = 1">
            <xsl:copy-of select="LastModified"/>
            <xsl:copy-of select="User"/>
            <Source>SF</Source>
         </xsl:if>

      </xsl:for-each>
   </xsl:template>
</xsl:stylesheet>

which returns:

<?xml version="1.0" encoding="UTF-8"?>
<LastModified>1/1/2024</LastModified>
<User>2</User>
<Source>SF</Source>

But this returns only the first record of the whole sorted list, because of the position() = 1 test. I need the max date for each user, with no user repeated. If I remove the xsl:if, I get every record sorted, but the users are repeated:

<?xml version="1.0" encoding="UTF-8"?>
<LastModified>1/1/2024</LastModified>
<User>2</User>
<Source>SF</Source>
<LastModified>1/12/2023</LastModified>
<User>2</User>
<Source>SF</Source>
<LastModified>1/11/2023</LastModified>
<User>1</User>
<Source>SF</Source>
<LastModified>28/9/2023</LastModified>
<User>4</User>
<Source>SF</Source>
<LastModified>24/1/2023</LastModified>
<User>4</User>
<Source>SF</Source>
<LastModified>15/3/2023</LastModified>
<User>5</User>
<Source>SF</Source>
<LastModified>10/3/2023</LastModified>
<User>5</User>
<Source>SF</Source>
<LastModified>1/1/2023</LastModified>
<User>1</User>
<Source>SF</Source>
<LastModified>5/7/2023</LastModified>
<User>3</User>
<Source>SF</Source>
<LastModified>9/8/2023</LastModified>
<User>3</User>
<Source>SF</Source>


Solution

  • Try perhaps something like:

    XSLT 2.0

    <xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    
    <xsl:template match="/root">
        <result>
            <xsl:for-each-group select="Data/Record" group-by="User">
                <user>
                    <xsl:for-each select="current-group()">
                        <xsl:sort select="tokenize(LastModified, '/')[3]" data-type="number" order="descending"/>
                        <xsl:sort select="tokenize(LastModified, '/')[2]" data-type="number" order="descending"/>
                        <xsl:sort select="tokenize(LastModified, '/')[1]" data-type="number" order="descending"/>
                        <xsl:if test="position() = 1">
                            <xsl:copy-of select="LastModified, User"/>
                            <Source>SF</Source>
                        </xsl:if>
                    </xsl:for-each>
                </user>
            </xsl:for-each-group>
        </result>
    </xsl:template>
    
    </xsl:stylesheet>
    

    Caveat: not tested very thoroughly.
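
    Since the question is also tagged xslt-1.0, here is a rough sketch of the same idea for a 1.0 processor, which has neither xsl:for-each-group nor tokenize(). It swaps in Muenchian grouping (xsl:key plus generate-id()) for the grouping step and substring-before()/substring-after() for splitting the date. The key name rec-by-user is my own invention, and the sketch assumes every LastModified is d/m/yyyy with exactly two slashes. Same caveat as above.

    XSLT 1.0

    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

    <!-- hypothetical key name: indexes every Record by its User value -->
    <xsl:key name="rec-by-user" match="Record" use="User"/>

    <xsl:template match="/root">
        <result>
            <!-- Muenchian grouping: keep only the first Record of each distinct User -->
            <xsl:for-each select="Data/Record[generate-id() = generate-id(key('rec-by-user', User)[1])]">
                <user>
                    <!-- all Records of the current User, latest date first -->
                    <xsl:for-each select="key('rec-by-user', User)">
                        <!-- year: text after the second slash (assumes d/m/yyyy) -->
                        <xsl:sort select="substring-after(substring-after(LastModified, '/'), '/')" data-type="number" order="descending"/>
                        <!-- month: text between the two slashes -->
                        <xsl:sort select="substring-before(substring-after(LastModified, '/'), '/')" data-type="number" order="descending"/>
                        <!-- day: text before the first slash -->
                        <xsl:sort select="substring-before(LastModified, '/')" data-type="number" order="descending"/>
                        <xsl:if test="position() = 1">
                            <xsl:copy-of select="LastModified"/>
                            <xsl:copy-of select="User"/>
                            <Source>SF</Source>
                        </xsl:if>
                    </xsl:for-each>
                </user>
            </xsl:for-each>
        </result>
    </xsl:template>

    </xsl:stylesheet>

    Unlike substring() with fixed offsets, the slash-based splitting does not depend on the day or month being two digits wide, which matters here because the input mixes values such as 1/1/2023 and 15/3/2023. With the sample input, each user element should hold the later of that user's two dates, for example 15/3/2023 for user 5.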