Search code examples
sqlsnowflake-cloud-data-platformdata-analysis

How to get the correct sum while doing group by, alongwith a where clause on columns which populate randomly for rows within the group?


I have, say, an 'order table' with multiple columns and thousands of rows. Each order id has multiple rows and "order value" column has mostly zeroes in all except 1-2 for each order. Now I need to find the summation of order value for each of the order IDs. I can simply use a sum(order_value) and do "group by" on order_id but the catch is that I also need to filter out the orders on the basis of two different columns. The filter condition values in one of the columns(Filter1 column in sample data) populate randomly and is also zeroes for rest of the rows within an order (just like "order value" column). So when I add Filter1 in where clause, the rows satisfying filter1 condition value are used while summing up but the "order value" for those rows could be zero (as "order value" is also populating randomly).

How to get correct sum of each order alongwith the filters applied. Below is a sample data for two order_id - 6180811 and 6119188:

Order_ID Filter1 Filter2 Order_value
6180811 0 NULL 0
6180811 888653 NULL 0
6180811 0 type 1 0.5
6180811 0 NULL 0
6180811 0 NULL 0
6180811 0 type 1 13.5
6119188 0 NULL 0
6119188 0 NULL 0
6119188 0 NULL 0
6119188 0 type 2 1.5
6119188 888621 NULL 0
6119188 0 type 2 15.5

Here is my sample SQL:

select order_id, sum(order_value) as total from order where Filter1 in ('888653','888621') and Filter2 in ('type 1','type 2') group by order_id;

Expected result:

order_id total
6180811 14
6119188 17

Actual result:

order_id total
6180811 0
6119188 0

Solution

  • You are probably going to need to have a JOIN and pre-aggregate the order by on its own merit, and the filter on its own. Something like

    select
          Totals.*      
       from
          ( select distinct
                  order_id
               from
                  order
               where
                      Filter1 in ('888653','888621') 
                  and Filter2 in ('type 1','type 2') ) Qualified
          JOIN
          ( select 
                  order_id,
                  sum( order_value ) TotalOrder
               from
                  order
               group by
                  order_id ) Totals
             on Qualified.Order_id = Totals.Order_id