Search code examples
sqlsql-servert-sqlsql-server-2014

Rank records based on 1 column's changing value


Q: How can I rank records based on 1 column's changing value?

I have the following data (https://pastebin.com/vdTb1JRT):

EmployeeID  Date        Onleave
ABH12345    2016-01-01  0
ABH12345    2016-01-02  0
ABH12345    2016-01-03  0
ABH12345    2016-01-04  0
ABH12345    2016-01-05  0
ABH12345    2016-01-06  0
ABH12345    2016-01-07  0
ABH12345    2016-01-08  0
ABH12345    2016-01-09  0
ABH12345    2016-01-10  1
ABH12345    2016-01-11  1
ABH12345    2016-01-12  1
ABH12345    2016-01-13  1
ABH12345    2016-01-14  0
ABH12345    2016-01-15  0
ABH12345    2016-01-16  0
ABH12345    2016-01-17  0

I would like to produce the following results:

 EmployeeID DateValidFrom    DateValidTo     OnLeave
 ABH12345   2016-01-01       2016-01-09      0
 ABH12345   2016-01-10       2016-01-13      1
 ABH12345   2016-01-14       2016-01-17      0

So I'm thinking if I can somehow create a ranked column (like shown below) that increments based on the value in the Onleave column - partitioned by the EmployeeID column.

EmployeeID  Date        Onleave    RankedCol
ABH12345    2016-01-01  0          1
ABH12345    2016-01-02  0          1
ABH12345    2016-01-03  0          1
ABH12345    2016-01-04  0          1
ABH12345    2016-01-05  0          1
ABH12345    2016-01-06  0          1
ABH12345    2016-01-07  0          1
ABH12345    2016-01-08  0          1
ABH12345    2016-01-09  0          1
ABH12345    2016-01-10  1          2
ABH12345    2016-01-11  1          2
ABH12345    2016-01-12  1          2
ABH12345    2016-01-13  1          2
ABH12345    2016-01-14  0          3
ABH12345    2016-01-15  0          3
ABH12345    2016-01-16  0          3
ABH12345    2016-01-17  0          3

Then I would be able to do the following:

SELECT
 [EmployeeID]    = [EmployeeID]
,[DateValidFrom] = MIN([Date])
,[DateValidTo]   = MAX([Date])
,[OnLeave]       = [OnLeave]
FROM table/view/cte/sub-query
GROUP BY 
 [EmployeeID]
,[OnLeave]
,[RankedCol]

Other solutions are very welcome..

Below is the test data :

WITH CTE AS ( SELECT EmployeeID = 'ABH12345', [Date] = CAST(N'2016-01-01' AS Date), [Onleave] = 0
UNION SELECT 'ABH12345', CAST(N'2016-01-02' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-03' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-04' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-05' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-06' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-07' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-08' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-09' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-10' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-11' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-12' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-13' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-14' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-15' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-16' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-17' AS Date), 0
)

SELECT * FROM CTE

Solution

  • Here is another, a bit simpler, way to get the desired output - accessing the table only once.

    -- sample of data from your question
    with t1(EmployeeID, Date1, Onleave) as(
      select 'ABH12345', cast('2016-01-01' as date),  0 union all
      select 'ABH12345', cast('2016-01-02' as date),  0 union all
      select 'ABH12345', cast('2016-01-03' as date),  0 union all
      select 'ABH12345', cast('2016-01-04' as date),  0 union all
      select 'ABH12345', cast('2016-01-05' as date),  0 union all
      select 'ABH12345', cast('2016-01-06' as date),  0 union all
      select 'ABH12345', cast('2016-01-07' as date),  0 union all
      select 'ABH12345', cast('2016-01-08' as date),  0 union all
      select 'ABH12345', cast('2016-01-09' as date),  0 union all
      select 'ABH12345', cast('2016-01-10' as date),  1 union all
      select 'ABH12345', cast('2016-01-11' as date),  1 union all
      select 'ABH12345', cast('2016-01-12' as date),  1 union all
      select 'ABH12345', cast('2016-01-13' as date),  1 union all
      select 'ABH12345', cast('2016-01-14' as date),  0 union all
      select 'ABH12345', cast('2016-01-15' as date),  0 union all
      select 'ABH12345', cast('2016-01-16' as date),  0 union all
      select 'ABH12345', cast('2016-01-17' as date),  0
    )
    -- actual query
    select max(w.employeeid) as employeeid
         , min(w.date1)      as datevalidfrom
         , max(w.date1)      as datevalidto
         , max(w.onleave)    as onleave 
      from (
            select row_number() over(partition by employeeid order by date1) -
                   row_number() over(partition by employeeid, onleave order by date1) as grp
                 , employeeid
                 , date1
                 , onleave
              from t1 s
            ) w
    group by w.grp
    order by employeeid, datevalidfrom
    

    Result:

    employeeid datevalidfrom datevalidto onleave
    ---------- ------------- ----------- -----------
    ABH12345   2016-01-01    2016-01-09  0
    ABH12345   2016-01-10    2016-01-13  1
    ABH12345   2016-01-14    2016-01-17  0