Search code examples
sqldatabaseanalysissnowflake-cloud-data-platform

Partition by with condition statement


I have data of products that are sold by various shops. For some shops they are sold with discount mapped by PROMO_FLG. I would like to display two COUNT PARTITION columns.

+-------------------------+--------------+---------------------+
| Store                   | Item         | PROMO_FLG|
|-------------------------+--------------+---------------------|
| 1                       |            1 |                   0 |
| 2                       |            1 |                   1 |
| 3                       |            1 |                   0 |
| 4                       |            1 |                   0 |
| 5                       |            1 |                   1 |
| 6                       |            1 |                   1 |
| 7                       |            1 |                   1 |
| 8                       |            1 |                   0 |
| 9                       |            1 |                   0 |
| 10                      |            1 |                   0 |
+-------------------------+--------------+---------------------+

First displays all shops that thave this product (which is done)

COUNT(DISTINCT STORE) OVER (PARTITION ITEM) would give is 10

Second one - which I seek - counts only these shops that have value in PROMO_FLG = 1 attribute.

That should give us value of 4


Solution

  • I think you want:

    select t.*,
           count(*) over (partition by item) as num_stores,
           sum(promo_flg) over (partition by item) as num_promo_1
    from t;
    

    If you actually need distinct counts:

    select t.*,
           count(distinct store) over (partition by item) as num_stores,
           count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
    from t;
    

    Here is a db<>fiddle. The fiddle uses Oracle because it supports COUNT(DISTINCT) as a window function.

    Here is an alternative, if the window functions don't work:

    select *
    from t join
         (select item, count(distinct store) as num_stores, count(distinct case when promo_flg = 1 then store end) as num_stores_promo
          from t
          group by item
         ) tt
         using (item);