I have this data set:
Id | PrevId | NextId | Product | Process | Date |
---|---|---|---|---|---|
1 | NULL | 4 | Product 1 | Process A | 2021-04-24 |
2 | NULL | 3 | Product 2 | Process A | 2021-04-24 |
3 | 2 | 5 | Product 2 | Process A | 2021-04-24 |
4 | 1 | 7 | Product 1 | Process B | 2021-04-26 |
5 | 3 | 6 | Product 2 | Process B | 2021-04-24 |
6 | 5 | NULL | Product 2 | Process B | 2021-04-24 |
7 | 4 | 9 | Product 1 | Process B | 2021-04-29 |
9 | 7 | 10 | Product 1 | Process A | 2021-05-01 |
10 | 9 | 15 | Product 1 | Process A | 2021-05-03 |
15 | 10 | 19 | Product 1 | Process A | 2021-05-04 |
19 | 15 | NULL | Product 1 | Process C | 2021-05-05 |
Per product, I need to tag consecutive/islands of records that have the same Process like:
Id | PrevId | NextId | Product | Process | Date | Tag |
---|---|---|---|---|---|---|
1 | NULL | 4 | Product 1 | Process A | 2021-04-24 | 1 |
4 | 1 | 7 | Product 1 | Process B | 2021-04-26 | 2 |
7 | 4 | 9 | Product 1 | Process B | 2021-04-29 | 2 |
9 | 7 | 10 | Product 1 | Process A | 2021-05-01 | 3 |
10 | 9 | 15 | Product 1 | Process A | 2021-05-03 | 3 |
15 | 10 | 19 | Product 1 | Process A | 2021-05-04 | 3 |
19 | 15 | NULL | Product 1 | Process C | 2021-05-05 | 4 |
A product goes through multiple Process-es and can go through the same one more than one time.
I basically need to produce the Tag column, the logic behind it is consecutive records with the same Process should be grouped together but a caveat is that the same process can appear further down the line but should be treated as a new group.
I have tried the basic windowing functions (ROW_NUMBER
and DENSE_RANK
) but the problem is that those count within the partition and not across partitions.
If you don't have to validate prevId and nextId (that is if your data is already correctly ordered) you could try the following:
WITH cte AS(
SELECT *
, ROW_NUMBER() OVER (PARTITION BY Product ORDER BY [Date]) x
, DENSE_RANK() OVER (PARTITION BY Product, Process ORDER BY [Date]) y
FROM T1
WHERE product = 'Product 1'
),
cteTag AS(
SELECT Id, PrevId, NextId, Product, Process, [Date], x-y AS Tag_
FROM cte
)
SELECT Id, PrevId, NextId, Product, Process, [Date], DENSE_RANK() OVER (PARTITION BY Product ORDER BY Tag_) AS Tag
FROM cteTag
ORDER BY [Date]