This query needs to delete over 17 million rows, from a table containing 20 million.
DELETE
FROM statements
WHERE agreement_id IN
(SELECT id
FROM agreements
WHERE created < DATE_SUB(CURDATE(), INTERVAL 6 MONTH));
DELETE
FROM agreements
WHERE created < DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
It takes hours to run, am I missing something that could speed things up a bit?
The subselect by itself takes a few seconds, I don't understand why the delete takes so long.
If you have this much delete to be undertaken. I suggest you to:
or
Also for your query,
never use IN clause for BIG data. Instead use exists which is more performant.
Basic script:
CREATE TABLE tmp_statements as
SELECT * FROM statements s where exists
(
select 1 FROM agreements a
WHERE
created < DATE_SUB(CURDATE(), INTERVAL 6 MONTH AND
s.agreement_id = a.agreement_id
));
DROP TABLE statements;
RENAME TABLE tmp_statements TO statements ;
--DONT FORGET TO RECREATE CREATE YOUR INDEXES, CONSTRAINTS;