Search code examples
sqlsql-servergroup-bydelete-row

How to delete certain rows depending on group by?


I have a tricky question. I need to delete rows from a table, but only those that are not the maximum entry in a group. Here is an example: Group is (Month, Fruit).

Month      | Fruit  | Quantity
2013-01-01 | Apple  | 5
2013-02-01 | Apple  | 6
2013-03-01 | Apple  | 7
2013-03-01 | Apple  | 8
2013-01-01 | Banana | 5
2013-02-01 | Banana | 6
2013-02-01 | Banana | 7

The correct transition would be (only the highest quantity should stay in each Month+Fruit group):

Month      | Fruit  | Quantity
2013-01-01 | Apple  | 5
2013-02-01 | Apple  | 6
2013-03-01 | Apple  | 8
2013-01-01 | Banana | 5
2013-02-01 | Banana | 7

(I actually need to have a full calendar-fruit table, for every fruit for every month. So if you manage to do it with adding a 2013-03 Banana, it is not a problem, it is a plus.)

Table is a declared variable @tempTable , deleting from that directly is preferred. I use SQL Server 2008 R2.


Solution

  • ;WITH x AS 
    (
      SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY Fruit, Month ORDER BY Quantity DESC)
      FROM @tempTable
    )
    SELECT * FROM x
    --DELETE x 
    WHERE rn > 1;
    

    When you are satisfied that the result identifies the correct rows that you want to delete, get rid of the SELECT and uncomment the DELETE.

    SQLfiddle demo