I'm trying to find the best way to delete millions of records in MySQL DB. I have a table with a PK on ID and an index on 'date' column and my delete queries are like:
DELETE FROM table WHERE date < '<today - 6 months>';
It's generating a lot of delay on the slave.
I had 2 options:
DELETE FROM table WHERE date < '<today - 6 months>' LIMIT 1000;
or
Include further Indexing or using PK for deleting.
I would like to hear your opinions. If using LIMIT won't change the workload or if using PK (in combination with LIMIT) is better.
The best way to delete lots of rows, especially in a replication setup, is to walk through the table via the PRIMARY KEY
in chunks of 1000 rows.
See this for details.
If this is a recurring task, then a 'time series' PARTITIONing
is even better. (Though it won't help until you set up partitioning.) See this.