I'm trying to perform a massive delete.
I thought that using joins instead of subqueries would make it faster.
I came up with this query:
delete t1
from table1 t1
join table2 t2 on t1.a = t2.a
join table3 t3 on t2.b = t3.b;
It takes an awfully long time, even when no rows are deleted, although the equivalent select is instantaneous:
select *
from table1 t1
join table2 t2 on t1.a = t2.a
join table3 t3 on t2.b = t3.b;
Why is that? How can I make my first query faster?
Edit: the execution plan
mysql> explain delete t1 from table1 t1 join table2 t2 on t1.a = t2.a join table3 t3 on t2.b = t3.b;
+----+-------------+-------+------------+-------+--------------------------+----------+---------+----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+--------------------------+----------+---------+----------+------+----------+-------------+
| 1 | SIMPLE | t2 | NULL | index | PRIMARY | b | 257 | NULL | 1 | 100.00 | Using index |
| 1 | DELETE | t1 | NULL | ref | a,FK2354764DB4B32 | a | 8 | db.t2.a | 1 | 100.00 | NULL |
| 1 | SIMPLE | t3 | NULL | ALL | NULL | NULL | NULL | NULL | 5000 | 10.00 | Using where |
+----+-------------+-------+------------+-------+--------------------------+----------+---------+----------+------+----------+-------------+
Edit 2: another attempt, this time using exists
mysql> explain delete from table1 t1 where exists (select 1 from table2 t2 where t2.a = t1.a and exists (select 1 from table3 t3 where t3.b = t2.b));
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------------------------------------------------------------+
| 1 | DELETE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 10000| 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | t2 | NULL | eq_ref | PRIMARY | PRIMARY | 8 | db.t1.a | 1 | 100.00 | NULL |
| 2 | DEPENDENT SUBQUERY | t3 | NULL | ALL | NULL | NULL | NULL | NULL | 5000 | 10.00 | Using where; FirstMatch(t2); Using join buffer (Block Nested Loop)|
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------------------------------------------------------------+
3 rows in set, 3 warnings (0.00 sec)
Thanks
If you are going to delete a significant number of rows in the table, it is often faster to move the rows you want to retain to another table, then truncate and reload the original table:
-- select the rows we want to keep into a new table
create table tmptable as
select *
from table1 t1
where not exists (
    select 1
    from table2 t2
    inner join table3 t3 on t3.b = t2.b
    where t2.a = t1.a
);
-- empty the original table
truncate table table1; -- !! back it up first !!
-- reload it
insert into table1 select * from tmptable;
-- done
drop table tmptable;
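For the backup mentioned in the truncate step, one minimal sketch is to copy the table before emptying it. The name table1_backup is hypothetical, and create table ... as select copies only the data, not the indexes or foreign keys, so treat it as a safety net for the rows rather than a full structural copy:

```sql
-- copy every row of table1 into a backup table
-- (table1_backup is a made-up name, not part of the original schema)
create table table1_backup as
select *
from table1;

-- sanity check: the two counts should match before running truncate
select
    (select count(*) from table1) as original_rows,
    (select count(*) from table1_backup) as backup_rows;
```

Once the reloaded table1 has been checked, table1_backup can be dropped the same way as tmptable.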