The task is to
return the top 5 customer ids and their rankings based on their spend for each store.
There are only 2 tables - payment and customer. There are 2 stores in total.
For the store_id = 2, the rank() gives repeating 1,2,2,3,4,5 values which is 6. I dont know how to choose the 5 with sql code. Since it is actually 6 - i can't "limit 5"
the sqlfiddle is here. I can't make it do the row_number() in sqlfiddle.
My own query:
with help2 as(
select
store_id,
customer.customer_id,
sum(amount) as revenue
from customer inner join payment on
customer.customer_id = payment.customer_id
where store_id = 2
group by
customer.customer_id
order by revenue desc)
select
store_id, customer_id, revenue,
rank() over(order by revenue desc) as ranking2
from help2
order by ranking2 limit 6 -- i dont think there should be limit 6, this is hard coding
the expected answer is :
store_id customer_id revenue ranking
2 526 221.55 1
2 178 194.61 2
2 137 194.61 2
2 469 177.60 3
2 181 174.66 4
2 259 170.67 5
The short answer is to use DENSE_RANK()
, not RANK()
. This will give you the correct ranking numbers that match the linked exercise's sample output.
From What’s the Difference Between RANK and DENSE_RANK in SQL?:
Unlike
DENSE_RANK
,RANK
skips positions after equal rankings. The number of positions skipped depends on how many rows had an identical ranking. For example, Mary and Lisa sold the same number of products and are both ranked as #2. WithRANK
, the next position is #4; withDENSE_RANK
, the next position is #3.
WITH CustomerSpending AS (
SELECT customer_id
,SUM(amount) as revenue
FROM payment
GROUP BY customer_id
)
,CustomerSpendingRanking AS (
SELECT customer.store_id
,customer.customer_id
,CustomerSpending.revenue
,DENSE_RANK() OVER (PARTITION BY customer.store_id ORDER BY CustomerSpending.revenue DESC) as ranking
FROM customer
JOIN CustomerSpending
ON customer.customer_id = CustomerSpending.customer_id
)
SELECT store_id
,customer_id
,revenue
,ranking
FROM CustomerSpendingRanking
WHERE ranking < 6;