sql-server, t-sql, group-by, partitioning, gaps-and-islands

Grouping based on Lag/Lead


In SQL Server 2014, I have the following table that tracks user activity:

USER_ID EVENT EVENT_DATE
15552221111 LOGIN 2022-06-01
15552221111 COMPLETE 2022-06-08
15552221111 LOGIN 2022-09-01
15552221111 SHUTDOWN 2022-09-11
15552222222 LOGIN 2022-04-01
15552222222 PROCESSING 2022-04-08
15552222222 PROCESSING 2022-06-10
15552222222 COMPLETE 2022-06-11
15552222222 LOGIN 2022-09-08

I need to create some sort of sequencing value, so that consecutive records for the same user that fall within 60 days of each other share the same number, and a gap of more than 60 days starts a new number. Desired result:

USER_ID EVENT EVENT_DATE SEQ
15552221111 LOGIN 2022-06-01 1
15552221111 COMPLETE 2022-06-08 1
15552221111 LOGIN 2022-09-01 2
15552221111 SHUTDOWN 2022-09-11 2
15552222222 LOGIN 2022-04-01 1
15552222222 PROCESSING 2022-04-08 1
15552222222 PROCESSING 2022-06-10 2
15552222222 COMPLETE 2022-06-11 2
15552222222 LOGIN 2022-09-08 3

Totally stuck, any ideas?

Here's some test code:

WITH testTable (USERID, EVENT, EVENT_DATE) AS
(
    SELECT 15552221111, 'LOGIN', '2022-06-01'   UNION ALL
    SELECT 15552221111, 'COMPLETE', '2022-06-08'   UNION ALL
    SELECT 15552221111, 'LOGIN', '2022-09-01'    UNION ALL
    SELECT 15552221111, 'SHUTDOWN', '2022-09-11'    UNION ALL
    SELECT 15552222222, 'LOGIN', '2022-04-01'    UNION ALL
    SELECT 15552222222, 'PROCESSING', '2022-04-08'    UNION ALL
    SELECT 15552222222, 'PROCESSING', '2022-06-10'    UNION ALL
    SELECT 15552222222, 'COMPLETE', '2022-06-11'    UNION ALL
    SELECT 15552222222, 'LOGIN', '2022-09-08'     
)     
SELECT 
    USERID
    , EVENT
    , EVENT_DATE
    , LEAD (EVENT_DATE, 1) OVER (PARTITION BY USERID ORDER BY EVENT_DATE) NEXT_DATE
    , ROW_NUMBER() OVER (PARTITION BY USERID ORDER BY EVENT_DATE) RECORD_SEQ    
FROM testTable 

Solution

  • I would use LAG() to flag each row whose EVENT_DATE is more than 60 days after the previous row for the same user, and then take a cumulative SUM() OVER (..) of those flags to get the SEQ that you want

    -- continues the WITH clause that defines testTable in the question
    , CTE AS
    (
      SELECT 
        USERID
        , EVENT
        , EVENT_DATE
        , CASE WHEN DATEDIFF(DAY
                           , LAG (EVENT_DATE, 1) OVER (PARTITION BY USERID ORDER BY EVENT_DATE)
                           , EVENT_DATE) > 60
               THEN 1
               ELSE 0
              END AS S        
      FROM testTable 
    )
    SELECT *, SUM(S) OVER (PARTITION BY USERID ORDER BY EVENT_DATE) + 1 AS SEQ
    FROM   CTE
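
For anyone who wants to run this end to end, here is a minimal self-contained sketch that combines the question's sample data with the LAG() flag and running SUM() described above. The names src, Flagged and IS_NEW_GROUP are mine, not from the question. I am also assuming the dates should be cast to date rather than left as strings, and that "more than 60 days" means a gap of exactly 60 days stays in the same group.

```sql
-- Sketch only: assumes SQL Server 2012 or later (LAG and ordered SUM() OVER).
WITH testTable (USERID, EVENT, EVENT_DATE) AS
(
    -- Sample rows from the question, with EVENT_DATE cast to a real date.
    SELECT src.USERID, src.EVENT, CAST(src.EVENT_DATE AS date)
    FROM (VALUES
        (15552221111, 'LOGIN',      '2022-06-01'),
        (15552221111, 'COMPLETE',   '2022-06-08'),
        (15552221111, 'LOGIN',      '2022-09-01'),
        (15552221111, 'SHUTDOWN',   '2022-09-11'),
        (15552222222, 'LOGIN',      '2022-04-01'),
        (15552222222, 'PROCESSING', '2022-04-08'),
        (15552222222, 'PROCESSING', '2022-06-10'),
        (15552222222, 'COMPLETE',   '2022-06-11'),
        (15552222222, 'LOGIN',      '2022-09-08')
    ) AS src (USERID, EVENT, EVENT_DATE)
),
Flagged AS
(
    -- 1 when the gap to the user's previous event exceeds 60 days, else 0.
    -- The first row per user has no previous row, so LAG returns NULL,
    -- the comparison is not true, and the CASE falls through to 0.
    SELECT USERID, EVENT, EVENT_DATE,
           CASE WHEN DATEDIFF(DAY,
                              LAG(EVENT_DATE) OVER (PARTITION BY USERID ORDER BY EVENT_DATE),
                              EVENT_DATE) > 60
                THEN 1 ELSE 0
           END AS IS_NEW_GROUP
    FROM testTable
)
-- The running total of the flags, plus 1, numbers the groups per user.
SELECT USERID, EVENT, EVENT_DATE,
       1 + SUM(IS_NEW_GROUP) OVER (PARTITION BY USERID ORDER BY EVENT_DATE) AS SEQ
FROM Flagged
ORDER BY USERID, EVENT_DATE;
```

On the sample data the gaps are 7, 85 and 10 days for the first user and 7, 63, 1 and 89 days for the second, so this should give SEQ values 1, 1, 2, 2 and 1, 1, 2, 2, 3, matching the desired result in the question.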