I need to perform a query to get the candies that are most liked by kids and here's what I've got:
SELECT COUNT(*) as counts,candies.name
FROM candies
INNER JOIN kid_candy ON kid_candy.candy_id = candies.id
INNER JOIN kids ON kids.id = kid_candy.kid_id
GROUP BY candies.name
which would return:
counts | name
--------+---------
3 | snowbear
3 | whiterabbit
2 | lollipop
All I want to see would be just
counts | name
--------+---------
3 | snowbear
3 | whiterabbit
So what should my query be?
Assuming a classical n:m relationship between kids and candies like detailed here:
You should provide such details in your question.
SELECT c.id, kc.counts, c.name
FROM (
SELECT candy_id AS id, count(*) AS counts
, rank() OVER (ORDER BY count(*) DESC) AS rnk
FROM kid_candy
GROUP BY 1
) kc
JOIN candies c USING (id)
WHERE kc.rnk = 1;
This should be considerably faster than a query with two CTEs and needless joins.
It is potentially wrong (and more expensive) to group by candies.name
. The name might not be unique. Use the primary key column for that, which is probably candies.id
Assuming referential integrity we do not need to join to the table kids
at all.
Since we need to inspect the whole table, it is faster to aggregate first and join to candies to get the name
later.
You can run a window function over an aggregate function: