Search code examples
sqlpostgresqlaggregategreatest-n-per-group

How to get the values for every group of the top 3 types


I've got this table ratings:

id user_id type value
0 0 Rest 4
1 0 Bar 3
2 0 Cine 2
3 0 Cafe 1
4 1 Rest 4
5 1 Bar 3
6 1 Cine 2
7 1 Cafe 5
8 2 Rest 4
9 2 Bar 3
10 3 Cine 2
11 3 Cafe 5

I want to have a table with a row for every pair (user_id, type) for the top 3 rated types through all users (ranked by sum(value) across the whole table).

Desired result:

user_id type value
0 Rest 4
0 Cafe 1
0 Bar 3
1 Rest 4
1 Cafe 5
1 Bar 3
2 Rest 4
3 Cafe 5
2 Bar 3

I was able to do this with two queries, one to get the top 3 and then another to get the rows where the type matches the top 3 types.

Does someone know how to fit this into a single query?


Solution

  • Get rows per user for the 3 highest ranking types, where types are ranked by the total sum of their value across the whole table.

    So it's not exactly about the top 3 types per user, but about the top 3 types overall. Not all users will have rows for the top 3 types, even if there would be 3 or more types for the user.

    Strategy:
    Aggregate to get summed values per type (type_rnk).
    Take only the top 3. (Break ties ...)
    Join back to main table, eliminating any other types.
    Order result by user_id, type_rnk DESC

    SELECT r.user_id, r.type, r.value
    FROM   ratings r
    JOIN  (
       SELECT type, sum(value) AS type_rnk
       FROM   ratings
       GROUP  BY 1
       ORDER  BY type_rnk DESC, type  -- tiebreaker
       LIMIT  3   -- strictly the top 3
       ) v USING (type)
    ORDER  BY user_id, type_rnk DESC;
    

    db<>fiddle here

    Since multiple types can have the same ranking, I added type to the sort order to break ties alphabetically by their name (as you did not specify otherwise).

    Turns out, we don't need window functions - the ones with OVER and, optionally, PARTITION for this. (Since you asked in a comment).