Search code examples
sqlpostgresqlgroup-bymaxgreatest-n-per-group

SQL MAX funtion where not all atributes are in the group by


So my current problem is that I have two tables that look like this:

table1(name, num_patient, quant, inst)
table2(inst_name, num_region)

Where I want to find the patient with max quantity per region.

I first had the idea of doing something like this:

SELECT num_region, num_patient, MAX(quant)
FROM
  (SELECT num_patient, quant, num_region
  FROM table1
  INNER JOIN table2
  ON table1.inst = table2.inst_name) AS joined_tables
GROUP BY num_region;

But this doesn't work since either num_patient has to be on the GROUP BY (and this way it doesn't return the max value by region anymore) or I have to remove it from the SELECT (also doesn't work because I need the name of each patient). I have tried to fix my issue with a WHERE quant = MAX() statement but couldn't get it to work. Is there any workaround to this?


Solution

  • Use DISTINCT ON:

    SELECT DISTINCT ON (num_region), num_patient, quant, num_region
    FROM table1 t1 JOIN
         table2 t2
         ON t1.inst = t2.inst_name
    ORDER BY num_region, quant DESC;
    

    DISTINCT ON is a convenient Postgres extension. It returns one row per keys specified in the SELECT, based on the ordering in the ORDER BY.

    Being an extension, not all databases support this functionality -- even databases derived from Postgres. The traditional method would use ROW_NUMBER():

    SELECT t.*
    FROM (SELECT num_patient, quant, num_region,
                 ROW_NUMBER() OVER (PARTITION BY num_region ORDER BY quant DESC) as seqnum
          FROM table1 t1 JOIN
               table2 t2
               ON t1.inst = t2.inst_name
         ) t
    WHERE seqnum = 1;