Search code examples
sqlsnowflake-cloud-data-platformgrouping

Group the ones that are the same and right next to each other in SQL


This is my table:

websit_id updated_at display_id
1222 03-06 06:00 apple
1222 03-06 08:00 apple
1222 03-06 10:00 carrot
1222 03-06 12:00 apple
1222 03-06 14:00 fig
1234 03-06 06:00 apple
1234 03-06 08:00 peach

I want to label the rows so that it groups the same display ID that are right next to each other but would not group them if there's something else in btw.

The desired output should be the following:

websit_id updated_at display_id group_label
1222 03-06 06:00 apple 1
1222 03-06 08:00 apple 1
1222 03-06 10:00 carrot 2
1222 03-06 12:00 apple 3
1222 03-06 14:00 fig 4
1234 03-06 06:00 apple 1
1234 03-06 08:00 peach 2

I am using snowflake for this.


Solution

  • You could create a flag to indicate if the current display_id is the same as the previous one and then run a cumulative sum over it to create groups

    with cte as
    
    (select *, case when display_id = lag(display_id) over (partition by websit_id order by updated_at) then 0 else 1 end as flag
    from mytable)
    
    select *, sum(flag) over (partition by websit_id order by updated_at) as group_label --add 1 if you want groups to start at 1 instead of 0
    from cte;