Search code examples
sqlschemapartitiongroup

Creating a new column using existing columns, but in non-exclusive conditions


This is for a case study.

I have the following table, T1 with two columns: c1, c2

    C1     C2
    1       3
    5       2
    4       10

I want to create a table T2 with both C1, C2, and a new column C3 defined in a way that

 if C1 > 3  then C3 = "C1_big"
 if C2 > 3  then C3 = "C2_big"
 else C3 = "not_big"

so using our table T1, we get

    C1     C2     C3
    1       3     "not_big"
    5       2     "C1_big"
    4       10    "C1_big"
    4       10    "C2_big"

notice that the last row satisfies conditions for both "C1_big", and "C2_big", so we have two rows for that.

I was thinking of using CASE WHEN, but then it only applies to mutually exclusive events. For our case, "C1_big", "C2_big" can both happen for a row, so it doesn't need to be mutually exclusive.

Is there a way to achieve this in SQL query? Any help would be greatly appreciated.


Solution

  • In Presto/Trino this can be achieved using array functions and unnest (using succinct syntax here):

    -- sample data
    with dataset (c1, c2) as (
        values (1, 3),
            (5, 2),
            (4, 10)
    )
    
    -- query
    select *
    from dataset,
         unnest(
            if(
                 c1 <= 3 and c2 <=3, array['not_big'],
                 filter(array[if(c1 > 3, 'C1_big'), if(c2 > 3, 'C2_big')], x -> x is not null)
            )) as t(c3);
    

    Without using Presto-specific functions - you can use union all:

    -- query
    select *, 'not_big' c3
    from dataset
    where c1 <= 3
      and c2 <= 3
    
    union all
    
    select *, 'C1_big' c3
    from dataset
    where c1 > 3
    
    union all
    
    select *, 'C2_big' c3
    from dataset
    where c2 > 3;