xslt, xslt-2.0, xslt-grouping

Total of consecutively dated time offs of the same type in xslt


I need to print consecutively dated time offs of the same type as a single row, based on the XML below. The total time off is the sum of the units of each time off row in the run, the start date is the date of the oldest time off row, and the end date is the date of the latest time off row.

--XML--

<?xml version='1.0' encoding='UTF-8'?>
<Data>
    <Worker>
        <Worker_ID>12</Worker_ID>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-09-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-08-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-02-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Compassionate Leave</Type>
            <Date>2018-02-01-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Statutory Holiday</Type>
            <Date>2018-02-07-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Statutory Holiday</Type>
            <Date>2018-02-06-08:00</Date>
            <Units>1</Units>
        </Time_Off>
    </Worker>
    <Worker>
        <Worker_ID>09</Worker_ID>
        <Time_Off>
            <Type>Sick Leave</Type>
            <Date>2018-02-10-08:00</Date>
            <Units>1</Units>
        </Time_Off>
    </Worker>
    <Worker>
        <Worker_ID>13</Worker_ID>
        <Time_Off>
            <Type>Vacation</Type>
            <Date>2018-02-11-08:00</Date>
            <Units>1</Units>
        </Time_Off>
        <Time_Off>
            <Type>Vacation</Type>
            <Date>2018-01-10-08:00</Date>
            <Units>1</Units>
        </Time_Off>
    </Worker>
</Data>

--Desired Output--

    EmployeeID,TimeOff Type,TimeOff Start Date,TimeOff End Date,Total Units 
    12,Compassionate Leave,2018-02-08,2018-02-09,2
    12,Compassionate Leave,2018-02-01,2018-02-02,2
    12,Statutory Holiday,2018-02-06,2018-02-07,2
    09,Sick Leave,2018-02-10,2018-02-10,1
    13,Vacation,2018-02-11,2018-02-11,1
    13,Vacation,2018-01-10,2018-01-10,1

Solution

  • A similar question, "How to group consecutive dates in XSLT?", has some suggestions on solving it, both with XSLT and with XQuery.

    https://xqueryfiddle.liberty-development.net/pPgCcoj/1 is an attempt to use the window clause in XQuery 3 to solve your problem:

    (: Drop the time zone offset and parse the first ten characters as a date. :)
    declare function local:date($input as xs:string) as xs:date {
        xs:date(substring($input, 1, 10))
    };

    string-join((
        'EmployeeID,TimeOff Type,TimeOff Start Date,TimeOff End Date,Total Units',
        (: Group per worker, so that equal types of different workers are not merged. :)
        for $worker in Data/Worker
        return
            for $time-off in $worker/Time_Off
            group by $type := data($time-off/Type)
            return
                let $times := for $time in $time-off
                              order by local:date($time/Date)
                              return $time
                return
                    (: A window ends when the next date is not exactly one day later. :)
                    for tumbling window $line in $times
                    start $s when true()
                    end $e next $n when empty($n) or local:date($n/Date) - local:date($e/Date) ne xs:dayTimeDuration('P1D')
                    return string-join(
                        ($worker/Worker_ID, $type, local:date($s/Date), local:date($e/Date), sum($line/Units)), ',')
    ), '&#10;')
    

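    With XSLT you could sort each worker's time offs of one type by date and then group them on the date minus position() days, a key that stays the same across a run of consecutive dates: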

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:mf="http://example.com/mf"
        exclude-result-prefixes="xs mf"
        version="3.0">
    
      <xsl:output method="text"/>
      <xsl:strip-space elements="*"/>
    
      <xsl:function name="mf:date" as="xs:date">
          <xsl:param name="input" as="xs:string"/>
          <xsl:sequence select="xs:date(substring($input, 1, 10))"/>
      </xsl:function>
    
      <xsl:function name="mf:line" as="xs:string">
          <xsl:param name="group" as="element(Time_Off)*"/>
          <xsl:value-of 
            select="$group[1]/../Worker_ID, 
                    $group[1]/Type, 
                    mf:date($group[1]/Date),
                    mf:date($group[last()]/Date),
                    sum($group/Units)"
                    separator=","/>      
      </xsl:function>
    
      <xsl:template match="Worker">
          <xsl:for-each-group select="Time_Off" group-by="Type">
              <xsl:variable name="sorted-times" as="element(Time_Off)*">
                  <xsl:perform-sort select="current-group()">
                      <xsl:sort select="mf:date(Date)"/>
                  </xsl:perform-sort>
              </xsl:variable>
              <xsl:for-each-group select="$sorted-times" group-by="mf:date(Date) - xs:dayTimeDuration('P1D') * position()">
                  <xsl:value-of select="mf:line(current-group()) || '&#10;'"/>
              </xsl:for-each-group>
          </xsl:for-each-group>
      </xsl:template>
    
    </xsl:stylesheet>
    

    The stylesheet is online at https://xsltfiddle.liberty-development.net/pPgCcov/1. It uses the || string concatenation operator of XPath 3; for an XSLT 2 processor you would need to use <xsl:value-of select="concat(mf:line(current-group()), '&#10;')"/> instead of <xsl:value-of select="mf:line(current-group()) || '&#10;'"/>.
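
    To see why the grouping key mf:date(Date) - xs:dayTimeDuration('P1D') * position() separates runs of consecutive days, here is a minimal standalone sketch. The four hard-coded dates and the template name main are assumptions made for illustration only; they are not part of the stylesheet above. In a date-sorted sequence, each consecutive day raises both the date and position() by one, so the difference stays constant, while a gap changes the difference and starts a new group.

    ```xslt
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="2.0">

      <xsl:output method="text"/>

      <!-- Hypothetical entry point: matches any input document, or can be
           called as the initial template named "main". -->
      <xsl:template match="/" name="main">
          <!-- Illustrative, already sorted dates: two runs of two days each. -->
          <xsl:variable name="dates" as="xs:date*"
            select="xs:date('2018-02-01'), xs:date('2018-02-02'),
                    xs:date('2018-02-08'), xs:date('2018-02-09')"/>
          <!-- Same key as in the answer: date minus position() days. -->
          <xsl:for-each-group select="$dates"
            group-by=". - xs:dayTimeDuration('P1D') * position()">
              <xsl:value-of
                select="concat('key ', current-grouping-key(), ': ',
                               string-join(for $d in current-group()
                                           return string($d), ', '),
                               '&#10;')"/>
          </xsl:for-each-group>
      </xsl:template>

    </xsl:stylesheet>
    ```

    The first two dates both map to the key 2018-01-31 (2018-02-01 minus one day, 2018-02-02 minus two days), and the last two both map to 2018-02-05, so the sketch should report two groups. The trick assumes at most one row per date within a type; two rows with the same date would get different keys and end up in separate groups.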