Search code examples
sqlpostgresqlaggregate-functionsgreatest-n-per-group

Getting only the highest counts per type in Postgres


Say I have a table with kids and their toys.

CREATE TABLE kids_toys (
  kid_name character varying,
  toy_type character varying,
  toy_name character varying
);
kid_name toy_type toy_name
Edward bear Pooh
Edward bear Pooh2
Edward bear Simba
Edward car Vroom
Lydia doll Sally
Lydia car Beeps
Lydia car Speedy
Edward car Red

I want to get a list of the the most popular toy type for each kid, grouped by kid. So the result would be

kid_name toy_type count
Edward bear 3
Lydia car 2

Assuming Postgres 15 as the engine, how would I query to do this? I keep getting stuck on how to generate the count but then only take the max result from each per-kid count.


Solution

  • First, group by kid_name and toy_type to find how many toys the kid has from each type.

    Then, add a row_number window function partitioned only by the kid_name and order by the count descending to find the position of each toy_type from highest count to lowest for each individual kid

    And lastly, filter only the records with row_num = 1

    Also, if you would like the top 3 toys per kid for example, you can use row_num <= 3 instead

    select kid_name, toy_type, cnt
    from
    (select kid_name, toy_type, cnt, row_number() over(partition by kid_name order by cnt desc) as row_num
      from (
        select kid_name, toy_type, count(*) as cnt
        from kids_toys
        group by kid_name, toy_type
      ) as grouped
    ) as with_row_num
    where row_num = 1