Search code examples
sql-serverlagrow-numberranking-functionspartition-by

Reset Row Number on value change, but with repeat values in partition


I'm having trouble with something very similar to this question T-sql Reset Row number on Field Change

The solution to this question is perfect, works fine. Except when I try with multiple other 'custno', it breaks down.

What I mean by that:

custno      moddate                     who
--------------------------------------------------
581827      2012-11-08 08:38:00.000     EMSZC14
581827      2012-11-08 08:41:10.000     EMSZC14
581827      2012-11-08 08:53:46.000     EMSZC14
581827      2012-11-08 08:57:04.000     EMSZC14
581827      2012-11-08 08:58:35.000     EMSZC14
581827      2012-11-08 08:59:13.000     EMSZC14
581827      2012-11-08 09:00:06.000     EMSZC14
581827      2012-11-08 09:04:39.000     EMSZC49 Reset row number to 1
581827      2012-11-08 09:05:04.000     EMSZC49
581827      2012-11-08 09:06:32.000     EMSZC49
581827      2012-11-08 09:12:03.000     EMSZC49
581827      2012-11-08 09:12:38.000     EMSZC49
581827      2012-11-08 09:14:18.000     EMSZC49
581827      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1
-- my new rows for example of problem
581829      2012-11-08 09:12:03.000     EMSZC14 1
581829      2012-11-08 09:12:38.000     EMSZC49 1
581829      2012-11-08 09:14:18.000     EMSZC49
581829      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1

The introduction of a new custno breaks this solution, which works perfectly for the one custno.

with C1 as
(
    select 
        custno, moddate, who,
        lag(who) over(order by moddate) as lag_who
    from 
        chr
),
C2 as
(
    select 
        custno, moddate, who,
        sum(case when who = lag_who then 0 else 1 end) 
            over(order by moddate rows unbounded preceding) as change 
    from 
        C1
)
select 
    row_number() over(partition by change order by moddate) as RowID,
    custno, moddate, who
from 
    C2

I'm sure it's only a little tweak to handle multiple custno's, but this is already way beyond my capabilities and I managed to make it work for my data but that was purely by replacing column and table names. Unfortunately don't have a detailed enough understanding to resolve the issue I have.

My data looks like

custno   start_date    value

effectively exactly the same. I want a Row/rank of 1 for every time the 'value' or 'who' changes, regardless if that value/who has been seen before. This is all relative to a custno. And I do see instances where a value/who can return back to the same value as well. Again, solution above handled that 'repetition' just fine... but for the one custno

I'm thinking I just need to somehow add some sort of grouping by custno in somewhere? Just not sure where or how

Thanks!


Solution

  • This is a gaps and islands problem, and we can use the difference in row numbers method here:

    WITH cte AS (
        SELECT *,
            ROW_NUMBER() OVER (PARTITION BY custno ORDER BY moddate) rn1,
            ROW_NUMBER() OVER (PARTITION BY custno, who ORDER BY moddate) rn2
        FROM chr
    )
    
    SELECT custno, moddate, who,
        ROW_NUMBER() OVER (PARTITION BY custno, rn1 - rn2 ORDER BY moddate) rn
    FROM cte
    ORDER BY
        custno,
        moddate;
    

    screen capture of demo below

    Demo

    For an explanation of the difference in row number method used here, rn1 is just a time-ordered sequence from 1 onwards, per customer, according to the data you have shown above. The rn2 sequence is partitioned additionally by who. It is the case the difference between rn1 and rn2 will always have the same value, for each customer. It is with this difference that we then take a row number over the entire table to generate the sequence you actually want to see.