Search code examples
sqlsnowflake-cloud-data-platformgrouping

How to give group numbers based on a condition in sql


Here is my table. I am using Snowflake

CREATE TABLE testx
(
    c_order int,    
    c_condition varchar(10)
);

INSERT INTO testx 
VALUES (1, 'and'), (2, 'and'), (3, 'or'), (4, 'and'), (5, 'or');

SELECT * FROM testx



c_order  c_condition
--------------------
1         and
2         and
3         or
4         and
5         or

I am trying to write a query which will give me group numbers based on the fact that consecutive 'and's should be with same group number. when 'or' comes, it should increase the group number. by the way, we should maintain the c_order also.

Here is the expected result:

c_order  c_condition   group_no
-------------------------------
1         and              1
2         and              1
3         or               2
4         and              2
5         or               3

I have tried using dense_rank like this:

SELECT 
    *, 
    DENSE_RANK() OVER (ORDER BY c_condition) 
FROM testx

But it doesn't return exactly what I want. Can somebody please help?`


Solution

  • Idea is to use same value for C_ORDER as group_no if C_ORDER is more then previous OR's c_order. In CTE we only select rows with OR and assign them a group number using ROW_NUMBER() generator -

    Main query -

    with temp_cte as
    (
    select c_order,
    case -- to check if 'or' is the first row or not
    when (select min(c_order) from testx where c_condition='or') = 
    (select min(c_order) from testx)
    then row_number() over (order by c_order)
    else
    row_number() over (order by c_order)+1
    end rn
    from testx, table(generator(rowcount=>1))
    where c_condition='or'
    )
    select x.c_order, x.c_condition,
    case
    when x.c_order = w.c_order
    then w.rn
    when x.c_order > (select min(c_order) from temp_cte)
    then (select max(rn) from temp_cte where c_order < x.c_order)
    else 1
    end seq1
    from testx x left join temp_cte w
    on x.c_order = w.c_order
    order by x.c_order;
    

    Output -

    C_ORDER C_CONDITION SEQ1
    1 and 1
    2 and 1
    3 or 2
    4 or 3
    5 and 3
    6 and 3
    7 or 4
    8 and 4
    9 or 5

    For data-set

    select * from testx;
    
    C_ORDER C_CONDITION
    1 and
    2 and
    3 or
    4 or
    5 and
    6 and
    7 or
    8 and
    9 or

    Or, just use CONDITIONAL_TRUE_EVENT. Refer

    with data(C_ORDER,C_CONDITION) as(
    select * from values
    (1,'and'),
    (2,'and'),
    (3,'or'),
    (4,'or'),
    (5,'and'),
    (6,'and'),
    (7,'or'),
    (8,'and'),
    (9,'or')
    )select c_order, c_condition,
    conditional_true_event(c_condition='or') over (order by c_order) grp
    from data;
    
    C_ORDER C_CONDITION GRP
    1 and 0
    2 and 0
    3 or 1
    4 or 2
    5 and 2
    6 and 2
    7 or 3
    8 and 3
    9 or 4