sql postgresql aggregate-functions greatest-n-per-group

Get rows with maximum count

I need to perform a query to get the candies that are most liked by kids and here's what I've got:

SELECT COUNT(*) as counts,candies.name
FROM candies 
INNER JOIN kid_candy ON kid_candy.candy_id = candies.id
INNER JOIN kids ON kids.id = kid_candy.kid_id
GROUP BY candies.name

which would return:

counts  | name
--------+---------
  3     | snowbear
  3     | whiterabbit
  2     | lollipop

All I want to see would be just

counts  | name
--------+---------
  3     | snowbear
  3     | whiterabbit

So what should my query be?

Solution

Assuming a classical n:m relationship between kids and candies like detailed here:

How to implement a many-to-many relationship in PostgreSQL?

You should provide such details in your question.

SELECT c.id, kc.counts, c.name
FROM  (
   SELECT candy_id AS id, count(*) AS counts
        , rank() OVER (ORDER BY count(*) DESC) AS rnk
   FROM   kid_candy
   GROUP  BY 1
   ) kc
JOIN   candies c USING (id)
WHERE  kc.rnk = 1;

This should be considerably faster than a query with two CTEs and needless joins.

Major points

It is potentially wrong (and more expensive) to group by candies.name. The name might not be unique. Use the primary key column for that, which is probably candies.id
Assuming referential integrity we do not need to join to the table kids at all.
Since we need to inspect the whole table, it is faster to aggregate first and join to candies to get the name later.
You can run a window function over an aggregate function:
- Get the distinct sum of a joined table column