Search code examples
mysqlduplicatesmariadbquery-optimizationsql-delete

Delete with join not completing due to high number of rows


I have a table with approximately 550k rows. I'm trying to execute this query:

DELETE t1 FROM categories t1
INNER JOIN categories t2
WHERE t1.id < t2.id
  AND t1.name = t2.name
  AND t1.book_id = t2.book_id

Unfortunately the shell freezes up and I can tell by counting the rows in another shell that nothing is happening.

Is there any way to buffer this query, or solve this issue in another way?

Any help is appreciated.


Solution

  • If you need to delete a large number of rows, it is usually more efficient to move the rows you want to retain to another table, truncate the original table, then insert back into it:

    -- move the rows we want to keep
    create table t1_tmp as 
    select name, book_id, min(id) id from t1 group by name, book_id;
    
    -- empty the table - back it up first! 
    truncate table t1;
    
    -- insert back into the table
    insert into t1 (id, name, book_id) 
    select id, name, book_id from t1_tmp;
    
    -- drop the temporary table
    drop table t1_tmp;