Search code examples
sqlsnowflake-cloud-data-platformgreatest-n-per-grouprow-number

Return only max value from one column when all other columns are the same


I have a Snowflake query that currently outputs results as follows:

ID City Fruit Quantity
123 LA Banana 1
123 LA Strawberry 1
124 LA Banana 3
125 NY Apple 2

I would like it to return only one ID line containing the max value of the fruit, so in this case, it would look like:

ID City Fruit Quantity
123 LA Strawberry 1
124 LA Banana 3
125 NY Apple 2

I have tried doing ROW_NUMBER() OVER Partition but it only ends up selecting rows that don't have two identical IDs, I have tried select distinct but it doesn't choose distinct values for Fruit, and I have tried select max(Fruit) but I get an error saying that it's not a valid group by expression.


Solution

  • Have you tried something like this?

    select ID, City, Fruit, Quantity FROM TestData
        QUALIFY row_number() over (partition by ID, City order by Fruit DESC ) = 1;
    
    +-----+------+------------+----------+
    | ID  | CITY |   FRUIT    | QUANTITY |
    +-----+------+------------+----------+
    | 123 | LA   | Strawberry |        1 |
    | 124 | LA   | Banana     |        3 |
    | 125 | NY   | Apple      |        2 |
    +-----+------+------------+----------+