Search code examples
sqlsql-servergreatest-n-per-group

Finding most recent date based on consecutive dates


I have s table that lists absences(holidays) of all employees, and what we would like to find out is who is away today, and the date that they will return.

Unfortunately, absences aren't given IDs, so you can't just retrieve the max date from an absence ID if one of those dates is today.

However, absences are given an incrementing ID per day as they are inputt, so I need a query that will find the employeeID if there is an entry with today's date, then increment the AbsenceID column to find the max date on that absence.

Table Example (assuming today's date is 11/11/2014, UK format):

AbsenceID   EmployeeID    AbsenceDate
100         10            11/11/2014
101         10            12/11/2014
102         10            13/11/2014
103         10            14/11/2014
104         10            15/11/2014
107         21            11/11/2014
108         21            12/11/2014
120         05            11/11/2014
130         15            20/11/2014
140         10            01/03/2015
141         10            02/03/2015
142         10            03/03/2015
143         10            04/03/2015

So, from the above, we'd want the return dates to be:

EmployeeID     ReturnDate
10             15/11/2014
21             12/11/2014
05             11/11/2014

Edit: note that the 140-143 range couldn't be included in the results as they appears in the future, and none of the date range of the absence are today.

Presumably I need an iterative sub-function running on each entry with today's date where the employeeID matches.


Solution

  • So based on what I believe you're asking, you want to return a list of the people that are off today and when they are expected back based on the holidays that you have recorded in the system, which should only work only on consecutive days.

    SQL Fiddle Demo

    Schema Setup:

    CREATE TABLE EmployeeAbsence
        ([AbsenceID] int, [EmployeeID] int, [AbsenceDate] DATETIME)
    ;
    
    INSERT INTO EmployeeAbsence
        ([AbsenceID], [EmployeeID], [AbsenceDate])
    VALUES
        (100, 10, '2014-11-11'),
        (101, 10, '2014-11-12'),
        (102, 10, '2014-11-13'),
        (103, 10, '2014-11-14'),
        (104, 10, '2014-11-15'),
        (107, 21, '2014-11-11'),
        (108, 21, '2014-11-12'),
        (120, 05, '2014-11-11'),
        (130, 15, '2014-11-20')
    ;
    

    Recursive CTE to generate the output:

    ;WITH cte AS (
        SELECT EmployeeID, AbsenceDate
        FROM dbo.EmployeeAbsence
        WHERE AbsenceDate = CAST(GETDATE() AS DATE)
        UNION ALL
        SELECT  e.EmployeeID, e.AbsenceDate
        FROM cte
        INNER JOIN dbo.EmployeeAbsence e ON e.EmployeeID = cte.EmployeeID 
               AND e.AbsenceDate = DATEADD(d,1,cte.AbsenceDate)
        )
    SELECT cte.EmployeeID, MAX(cte.AbsenceDate) 
    FROM cte
    GROUP BY cte.EmployeeID
    

    Results:

    | EMPLOYEEID |                     Return Date |
    |------------|---------------------------------|
    |          5 | November, 11 2014 00:00:00+0000 |
    |         10 | November, 15 2014 00:00:00+0000 |
    |         21 | November, 12 2014 00:00:00+0000 |
    

    Explanation:

    The first SELECT in the CTE gets employees that are off today with this filter:

    WHERE AbsenceDate = CAST(GETDATE() AS DATE)
    

    This result set is then UNIONED back to the EmployeeAbsence table with a join that matches EmployeeID as well as the AbsenceDate + 1 day to find the consecutive days recursively using:

    -- add a day to the cte.AbsenceDate from the first SELECT
    e.AbsenceDate = DATEADD(d,1,cte.AbsenceDate) 
    

    The final SELECT simply groups the cte results by employee with the MAX AbsenceDate that has been calculated per employee.

    SELECT cte.EmployeeID, MAX(cte.AbsenceDate) 
    FROM cte
    GROUP BY cte.EmployeeID
    

    Excluding Weekends:

    I've done a quick test based on your comment and the below modification to the INNER JOIN within the CTE should exclude weekends when adding the extra days if it detects that adding a day will result in a Saturday:

    INNER JOIN dbo.EmployeeAbsence e ON e.EmployeeID = cte.EmployeeID 
           AND e.AbsenceDate = CASE WHEN datepart(dw,DATEADD(d,1,cte.AbsenceDate)) = 7 
                                    THEN DATEADD(d,3,cte.AbsenceDate) 
                               ELSE DATEADD(d,1,cte.AbsenceDate) END
    

    So when you add a day: datepart(dw,DATEADD(d,1,cte.AbsenceDate)) = 7, if it results in Saturday (7), then you add 3 days instead of 1 to get Monday: DATEADD(d,3,cte.AbsenceDate).