Search code examples
mysqlgroup-bysql-delete

Deleting records based on a group by in MYSQL


I'm having trouble coming up with a query which is going to allow me to keep only the most recent order from a user (maybe a better way to say this is delete all old orders):

CREATE TABLE orders(id integer, created_at datetime, user_id integer, label nvarchar(25));

INSERT INTO orders values(1, now(), 1, 'FRED FIRST');

INSERT INTO orders values(2, DATE_ADD(now(), INTERVAL 1 DAY), 1, 'FRED SECOND');

INSERT INTO orders values(3, DATE_ADD(now(), INTERVAL 2 DAY), 1, 'FRED THIRD');

INSERT INTO orders values(4, DATE_ADD(now(), INTERVAL 1 DAY), 3, 'BARNEY FIRST');

SELECT * FROM orders;

'1','2014-03-07 08:39:36','1','FRED FIRST'
'2','2014-03-08 08:39:36','1','FRED SECOND'
'3','2014-03-09 08:39:36','1','FRED THIRD'
'4','2014-03-08 08:39:36','3','BARNEY FIRST'

I would like to run a query which would leave me with FRED's THIRD order and BARNEY's FIRST order. FRED FIRST and FRED SECOND should be deleted because they are not the latest order from FRED.

Any thoughts about how I might be able to do this with a single query?

EDIT: After posting this, I found something that works (it does what I'm looking to do)-- but it seems a bit messy:

DELETE  old_orders
FROM orders old_orders
left outer join(
SELECT MAX(created_at) as created_at, user_id
FROM orders
GROUP BY user_id) new_orders
ON new_orders.user_id = old_orders.user_id and new_orders.created_at = old_orders.created_at
WHERE new_orders.user_id is null;

Solution

  • Use a nested query, like this:

    DELETE FROM orders
    WHERE id NOT IN (
      SELECT id FROM (
        select id from orders o JOIN (
          select user_id, max(created_at) t from orders group by user_id
        ) o1 ON o.user_id = o1.user_id AND o.created_at = o1.t
      ) AS tmp
    )
    

    Working Fiddle: http://sqlfiddle.com/#!2/56d913/1