Search code examples
sqlpostgresqlgreatest-n-per-group

Return the row with max value for each group


I have the following table with name t2:

  realm   |    race    | gender | total  
----------+------------+--------+--------
 Buffalo  | faerie     | F      |   5972
 Buffalo  | faerie     | M      |   2428
 Buffalo  | footballer | F      |   1954
 Buffalo  | footballer | M      |   2093
 Buffalo  | raccoon    | F      |   2118
 Buffalo  | raccoon    | M      |   1237
 Buffalo  | shark      | F      |  12497
 Buffalo  | shark      | M      |   3621
 Buffalo  | wizard     | F      |    468
 Buffalo  | wizard     | M      |  11079
 Camelot  | faerie     | F      |   2414
 Camelot  | faerie     | M      |   1455

I want to create a query that just selects the realm, race and gender with the highest total. Every time I use GROUP BY I keep getting both genders.

The output table looks like this:

  realm   |    race    | gender | total  
----------+------------+--------+--------
 Buffalo  | faerie     | F      |   5972
 Buffalo  | footballer | M      |   2093
 Buffalo  | raccoon    | F      |   2118
...

I think I have a very poor understanding on how to compare rows.
I can't figure out how to write the WHERE clause so that when I GROUP BY realm,race,gender, I only get 1 gender.


Solution

  • A perfect use case for DISTINCT ON:

    SELECT DISTINCT ON (realm, race) *
    FROM   tbl
    ORDER  BY realm, race, total DESC;
    

    db<>fiddle here

    Notably, the query has no GROUP BY at all.
    Assuming total is NOT NULL, else append NULLS LAST.
    In case of a tie, the winner is arbitrary unless you add more ORDER BY items to break the tie.

    Detailed explanation: