Search code examples
sqlpostgresqlgreatest-n-per-group

SQL for sort within group show top within group in Postgres


I have a table in Postgres that has data like this:

fk | priority | date
1       1       4-1-2023
1       1       4-3-2023
1       2       4-5-2023
2       3       4-1-2023
2       2       4-2-2023
3       3       4-6-2023

I need the results to be this:

fk | priority | date
1       1       4-3-2023
2       2       4-2-2023
3       3       4-6-2023

So within each grouping of fk I want the highest priority. 1 being the highest. And if there is a priority tie then I want the most recent date.

My SQL is not this advanced and I am having a hard time finding a solution for this. Thanks in advance.


Solution

  • This can be done with either a window function with a subquery (works for multiple RDBMS)

    SELECT fk, priority, mydate
    FROM (
        SELECT *, /* Return all the columns we want in the final query */
            ROW_NUMBER() OVER (PARTITION BY fk ORDER BY priority, mydate DESC) AS rn
            /* 
                Generate a row number, 
                restarting the numbering for each `fk` value seen (partition by)
                Order the rows, first: the smallest number `priority` (ascending),
                then: the latest `mydate` (descending)
                Put the result in a column named `rn` to use in the outer query
            */
        FROM mytable
    ) x
    WHERE rn = 1 /* Filter by the `rn` value, return only the first row */
    ;
    

    Or with a DISTINCT ON clause (more postgresql specific)

    SELECT DISTINCT ON (fk) * /* Return a distinct row for each fk value */
    FROM mytable
    ORDER BY fk, priority, mydate DESC 
    /* 
        Order by which is the first row to return 
        It is required to include `fk` first because of the DISTINCT ON (fk), 
        but then: get the smallest `priority` (ascending),
        then: the latest `mydate` (descending).
    */
    ;