mysql, sql, query-optimization, inner-join, sql-delete

Delete performance vs select on multiple joins?


I'm trying to perform a massive delete.

I thought that using joins instead of subqueries would make it faster.

I came up with this query:

delete t1
    from table1 t1
    join table2 t2  on t1.a = t2.a
    join table3 t3  on t2.b = t3.b;

It takes an awfully long time, even when no rows are deleted, although the equivalent select is instantaneous:

select *
    from table1 t1
    join table2 t2 on t1.a = t2.a
    join table3 t3 on t2.b = t3.b;

Why is that? How can I make the first query faster?

Edit: the execution plan

mysql> explain delete t1 from table1 t1 join table2 t2 on t1.a = t2.a join table3 t3 on t2.b = t3.b;
+----+-------------+-------+------------+-------+--------------------------+----------+---------+----------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys            | key      | key_len | ref      | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+--------------------------+----------+---------+----------+------+----------+-------------+
|  1 | SIMPLE      | t2    | NULL       | index | PRIMARY                  | b        | 257     | NULL     |    1 |   100.00 | Using index |
|  1 | DELETE      | t1    | NULL       | ref   | a,FK2354764DB4B32        | a        | 8       | db.t2.a  |    1 |   100.00 | NULL        |
|  1 | SIMPLE      | t3    | NULL       | ALL   | NULL                     | NULL     | NULL    | NULL     | 5000 |    10.00 | Using where |
+----+-------------+-------+------------+-------+--------------------------+----------+---------+----------+------+----------+-------------+

Edit 2: another attempt, using exists

mysql> explain delete from table1 t1 where exists (select 1 from table2 t2 where t2.a = t1.a and exists (select 1 from table3 t3 where t3.b = t2.b));
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------------------------------------------------------------+
| id | select_type        | table | partitions | type   | possible_keys | key     | key_len | ref           | rows | filtered | Extra                                                             |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------------------------------------------------------------+
|  1 | DELETE             | t1    | NULL       | ALL    | NULL          | NULL    | NULL    | NULL          | 10000|   100.00 | Using where                                                       |
|  2 | DEPENDENT SUBQUERY | t2    | NULL       | eq_ref | PRIMARY       | PRIMARY | 8       | db.t1.a       |    1 |   100.00 | NULL                                                              |
|  2 | DEPENDENT SUBQUERY | t3    | NULL       | ALL    | NULL          | NULL    | NULL    | NULL          | 5000 |    10.00 | Using where; FirstMatch(t2); Using join buffer (Block Nested Loop)|
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------------------------------------------------------------+
3 rows in set, 3 warnings (0.00 sec)

Thanks


Solution

  • If you are going to delete a significant number of rows in the table, it is often faster to move the rows you want to retain to another table, then truncate and reload the original table:

    -- select the rows we want to keep into a new table
    create table tmptable as 
    select *
    from table1 t1
    where not exists (
        select 1
        from table2 t2
        inner join table3 t3 on t3.b = t2.b
        where t2.a = t1.a
    );
    
    -- empty the original table
    -- (truncate cannot be rolled back, so back the table up first)
    truncate table table1;
    
    -- reload it
    insert into table1 select * from tmptable;
    
    -- done
    drop table tmptable;
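
A possible variant of the steps above, if you would rather not have a window in which table1 is empty: build the reduced table under a new name and swap it in with a single rename. This is only a sketch, not something tested against your schema. The names table1_new and table1_old are made up for illustration, and it assumes no other table has a foreign key pointing at table1 (such a key would follow the renamed table, so check for that first).

```sql
-- Sketch only: table1_new and table1_old are hypothetical names.

-- empty copy of table1 with the same columns and indexes
-- (create table ... like does not copy foreign key definitions)
create table table1_new like table1;

-- copy only the rows we want to keep
insert into table1_new
select t1.*
from table1 t1
where not exists (
    select 1
    from table2 t2
    inner join table3 t3 on t3.b = t2.b
    where t2.a = t1.a
);

-- swap the two tables in one atomic statement
rename table table1 to table1_old,
             table1_new to table1;

-- once the new table1 has been checked, remove the old copy
-- drop table table1_old;
```

The old data stays available in table1_old until you drop it, which doubles as the backup. Any foreign keys that table1 itself declares would have to be re-created on the new table by hand.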