Search code examples
sqlsql-serverdense-ranksql-rank

Using RANK OVER PARTITION to Compare a Previous Row Result


I'm working with a dataset that contains (among other columns) a userID and startDate. The goal is to have a new column "isRehire" that compares their startDate with previous startDates.

If the difference between startDates is within 1 year, isRehire = Y.

The difficulty and my issue comes in when there are more than 2 startDates for a user. If the difference between the 3rd and 1st startDate is over a year, the 3rd startDate would be the new "base date" for being a rehire.

userID startDate isRehire
123 07/24/19 N
123 02/04/20 Y
123 08/25/20 N
123 12/20/20 Y
123 06/15/21 Y
123 08/20/21 Y
123 08/30/21 N

In the above example you can see the issue visualized. The first startDate 07/24/19, the user is not a Rehire. The second startDate 02/04/20, they are a Rehire. The 3rd startDate 08/25/20 the user is not a rehire because it has been over 1 year since their initial startDate. This is the new "anchor" date.

The next 3 instances are all Y as they are within 1 year of the new "anchor" date of 08/25/20. The final startDate of 08/30/21 is over a year past 08/25/20, indicating a "N" and the "cycle" resets again with 08/30/21 as the new "anchor" date.

My goal is to utilize RANK OVER PARTITION to be able to complete this, as from my testing I believe there must be a way to assign ranks to the dates which can then be wrapped in a select statement for a CASE expression to be written. Although it's completely possible I'm barking up the wrong tree entirely.

Below you can see some of the code I've attempted to use to complete this, although without much success so far.

select TestRank,
startDate,
userID,
CASE WHEN TestRank = TestRank THEN (TestRank - 1
                                            ) ELSE '' END AS TestRank2
from
(

select userID,
startDate
RANK() OVER (PARTITION BY userID
            ORDER BY startDate desc)
            as TestRank
from [MyTable] a
WHERE a.userID = [int]

) b 

Solution

  • This is complicated logic, and window functions are not sufficient. To solve this, you need iteration -- or in SQL-speak, a recursive CTE:

    with t as (
          select t.*, row_number() over (partition by id order by startdate) as seqnum
          from mytable t
         ),
         cte as (
          select t.id, t.startdate, t.seqnum, 'N' as isrehire, t.startdate as anchordate
          from t
          where seqnum = 1
          union all
          select t.id, t.startdate, t.seqnum,
                 (case when t.startdate > dateadd(year, 1, cte.anchordate) then 'N' else 'Y' end),
                 (case when t.startdate > dateadd(year, 1, cte.anchordate) then t.startdate else cte.anchordate end)
          from cte join
               t
               on t.seqnum = cte.seqnum + 1
         )
    select *
    from cte
    order by id, startdate;
    

    Here is a db<>fiddle.