Search code examples
sqlpostgresqlsnowflake-cloud-data-platformdata-partitioning

SQL - return rows in partition based on max value


I have below dataset with comments which row must be returned.

INSERT INTO rates
  (country,kg_from,kg_to,value)
VALUES
  --('pl', '0', '5', '2.5'),
  --('pl', '5', '10', '4.5'),
  --('pl', '10', '15', '6'),
  --('pl', '15', '20', '8'), -- return this row
  --('de', '0', '5', '1.5'),
  --('de', '5', '10', '1.5'),
  --('de', '10', '15', '1.5'),
  --('de', '15', '45', '1.5'),  -- return this row
  --('cz', '0', '5', '5'),
  --('cz', '5', '10', '5'),
  --('cz', '10', '15', '6'),
  --('cz', '15', '30', '4') -- return this row

Logic is: return value for max kg_to within partition of each country.

Current working code:

select t.country, t.kg_to, t.value
from rates t
inner join (select country, max(t2.kg_to) as max_kg
                      from rates t2
                      group by 1) t2 on t.country = t2.country
WHERE t.kg_to = t2.max_kg;
enter code here

Question:

  1. Shorter code would be better, any ideas on how to improve it?

Solution

  • Use distinct on:

    select distinct on (t.country) r.*
    from rates r
    order by t.country, kg_to desc;
    

    Or window functions:

    select r.*
    from (select r.*,
                 row_number() over (partition by country order by kg_to desc) as seqnum
          from rates r
         ) r
    where seqnum = 1;
    

    Note: I also don't see how your code could be retrieving duplicates, unless you have duplicate maximum values for a country in your table.