Search code examples
sqlgoogle-bigquerylegacy

IF column has a duplicated value THEN choose the next one


Thanks for looking into this.

How do I write an IF statement in Bigquery SQL, to sum up, the numbers of units (If the QUARTER columns have the SAME quarter then choose the unit number with TYPE ACTL, OTLK, ETC). TYPE CWV are historical data so it's ok to have duplicated quarters but if there are units with TYPE ACTL, OTLK, etc with the same quarter as in CWV then I would like the SQL to choose UNITS with TYPE ACTL, OTLK, etc over CWV.

Note: this sql will be used for Tableau and my total of units for XYZ should be 977071 (excluding the CWV / FY20-Q4 / 104955)

Row TMC_BC TYPE PROGRAM QUARTER UNITS
1 TMC ACTL XYZ FY20-Q2 0
2 TMC OTLK XYZ FY20-Q3 6500
3 TMC CWV XYZ FY20-Q4 104955
4 TMC ACTL XYZ FY20-Q4 191300
5 TMC CWV XYZ FY21-Q1 182448
6 TMC_BC CWV XYZ FY21-Q2 91346
7 TMC CWV XYZ FY21-Q2 158126
8 TMC_BC CWV XYZ FY21-Q1 62500
9 TMC CWV XYZ FY21-Q1 157287
10 TMC_BC CWV XYZ FY21-Q1 62500
11 TMC CWV XYZ FY21-Q1 31410
12 TMC_BC CWV XYZ FY21-Q1 33654

my current SQL:

SELECT
case when tmc <= 0 then 'TMC_BC'
      else 'TMC'
      end as TMC_BC,
TYPE,
PROGRAM,
QUARTER,
sum(ca) as UNITS,
from xx_REPORTS
where PROGRAM like 'XYZ'
and TYPE <> 'PDP' and TYPE <> 'POR'
and GEO = 'NA'
group by 1,2,3,4
order by 4 asc

Solution

  • Hmmm . . . If I understand correctly, you want 'CWV' only when there are no other types for the quarter/PROGRAM/tmc_bc combination. You can use window functions for this filtering:

    select r.*
    from (select r.*,
                 count(*) over (partition by quarter, program, tmc_bc) as q_cnt,
                 countif(type = 'CWV') over (partition by quarter, program, tmc_bc) as q_cnt_cwv
          from xx_REPORTS r
         ) r
    where (q_cnt = q_cnt_cwv) or (type <> 'CWV');
    

    Actually, if the logic is correct, then what you want are non-CMV rows if there are any, otherwise all rows. A simpler expression is:

    select r.*
    from (select r.*,
                 countif(type <> 'CWV') over (partition by quarter, program, tmc_bc) as q_cnt_noncwv
          from xx_REPORTS r
         ) r
    where (q_cnt_noncwv > 0 and type <> 'CWV') or q.cnt_noncwv = 0;