Search code examples
sqlpostgresqlgreatest-n-per-group

Select first row in each GROUP BY group?


I'd like to select the first row of each set of rows grouped with a GROUP BY.

Specifically, if I've got a purchases table that looks like this:

SELECT * FROM purchases;

My Output:

id customer total
1 Joe 5
2 Sally 3
3 Joe 2
4 Sally 1

I'd like to query for the id of the largest purchase (total) made by each customer. Something like this:

SELECT FIRST(id), customer, FIRST(total)
FROM  purchases
GROUP BY customer
ORDER BY total DESC;

Expected Output:

FIRST(id) customer FIRST(total)
1 Joe 5
2 Sally 3

Solution

  • On databases that support CTE and windowing functions:

    WITH summary AS (
        SELECT p.id, 
               p.customer, 
               p.total, 
               ROW_NUMBER() OVER(PARTITION BY p.customer 
                                     ORDER BY p.total DESC) AS rank
          FROM PURCHASES p)
     SELECT *
       FROM summary
     WHERE rank = 1
    

    Supported by any database:

    But you need to add logic to break ties:

      SELECT MIN(x.id),  -- change to MAX if you want the highest
             x.customer, 
             x.total
        FROM PURCHASES x
        JOIN (SELECT p.customer,
                     MAX(total) AS max_total
                FROM PURCHASES p
            GROUP BY p.customer) y ON y.customer = x.customer
                                  AND y.max_total = x.total
    GROUP BY x.customer, x.total