In BigQuery, I'm wanting to create a ranked list by SKU sales per Season, however I'm getting a bit lost with how to do this with the current state of my code. My desired output is something like:
Season | SKU | total_spent | Rank
Spring 2020 | SKU_sample1 | $200 | 1
Spring 2020 | SKU_sample2 | $190 | 2
Spring 2020 | SKU_sample3 | $185 | 3
--- so forth, then restart the ranking when the Season changes
Season | SKU | total_spent | Rank
Halloween 2020 | SKU_sample1 | $500 | 1
Halloween 2020 | SKU_sample2 | $400 | 2
Halloween 2020 | SKU_sample3 | $300 | 3
My base code is this:
SELECT DATE(sales_time) as sales_time,
CASE
WHEN DATE(sales_time) >= '2020-04-09' AND DATE(sales_time) <= '2020-04-23' THEN 'Spring 2020'
WHEN DATE(sales_time) >= '2020-10-29' AND DATE(sales_time) <= '2020-11-02' THEN 'Halloween 2020'
WHEN DATE(sales_time) >= '2020-11-25' AND DATE(sales_time) <= '2020-12-03' THEN 'Thanksgiving 2020'
WHEN DATE(sales_time) >= '2020-12-17' AND DATE(sales_time) <= '2021-01-04' THEN 'Xmas 2020'
ELSE
'unknown_season'
END
AS season,
sku,
SUM(salesPrice) as total_spent
FROM sales_table
WHERE
DATE(sales_time) >= '2020-04-09' AND DATE(sales_time) <= '2020-04-23'
OR (DATE(sales_time) >= '2020-10-29' AND DATE(sales_time) <= '2020-11-02')
OR (DATE(sales_time) >= '2020-11-25' AND DATE(sales_time) <= '2020-12-03')
OR (DATE(sales_time) >= '2020-12-17' AND DATE(sales_time) <= '2021-01-04')
GROUP BY sku,
DATE(sales_time),
salesPrice,
season
I've broken it down a bit using CTEs to encapsulate the ranking logic:
with data as (
select
sku,
case
when date(sales_time) between '2020-04-09' and '2020-04-23' then 'Spring 2020'
when date(sales_time) between '2020-10-29' and '2020-11-02' then 'Halloween 2020'
when date(sales_time) between '2020-11-25' and '2020-12-03' then 'Thanksgiving 2020'
when date(sales_time) between '2020-12-17' and '2021-01-04' then 'Xmas 2020'
else 'unknown_season'
end as season,
sum(salesPrice) as total_spent
from `project.dataset.sales_table`
where
(
date(sales_time) between '2020-04-09' and '2020-04-23' or
date(sales_time) between '2020-10-29' and '2020-11-02' or
date(sales_time) between '2020-11-25' and '2020-12-03' or
date(sales_time) between '2020-12-17' and '2021-01-04'
)
group by 1,2
),
ranked as (
select
season,
sku,
total_spent,
-- Within each season, rank by total_spent -- could also use row_number() if you want to break ties
rank() over(partition by season order by total_spent desc) as spend_rank
from data
)
select * from ranked
order by season, spend_rank asc