sql database analysis snowflake-cloud-data-platform

Partition by with condition statement

I have data of products that are sold by various shops. For some shops they are sold with discount mapped by PROMO_FLG. I would like to display two COUNT PARTITION columns.

+-------------------------+--------------+---------------------+
| Store                   | Item         | PROMO_FLG|
|-------------------------+--------------+---------------------|
| 1                       |            1 |                   0 |
| 2                       |            1 |                   1 |
| 3                       |            1 |                   0 |
| 4                       |            1 |                   0 |
| 5                       |            1 |                   1 |
| 6                       |            1 |                   1 |
| 7                       |            1 |                   1 |
| 8                       |            1 |                   0 |
| 9                       |            1 |                   0 |
| 10                      |            1 |                   0 |
+-------------------------+--------------+---------------------+

First displays all shops that thave this product (which is done)

COUNT(DISTINCT STORE) OVER (PARTITION ITEM) would give is 10

Second one - which I seek - counts only these shops that have value in PROMO_FLG = 1 attribute.

That should give us value of 4

Solution

I think you want:

select t.*,
       count(*) over (partition by item) as num_stores,
       sum(promo_flg) over (partition by item) as num_promo_1
from t;

If you actually need distinct counts:

select t.*,
       count(distinct store) over (partition by item) as num_stores,
       count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from t;

Here is a db<>fiddle. The fiddle uses Oracle because it supports COUNT(DISTINCT) as a window function.

Here is an alternative, if the window functions don't work:

select *
from t join
     (select item, count(distinct store) as num_stores, count(distinct case when promo_flg = 1 then store end) as num_stores_promo
      from t
      group by item
     ) tt
     using (item);