Search code examples
mysqlinnodb

Does a MySQL DELETE physically reorder an InnoDB table?


Does a DELETE statement on an InnoDB table in MySQL cause the table to be physically reordered? I have a huge (200 million) row table that I want to delete certain records from. When I delete from it based on a WHERE condition it takes something like 90 minutes but only affects a few hundred thousand rows.

According to this stackoverflow answer, that's because deleting rows causes the table to be physically reordered on the disk: How can I improve DELETE FROM performance on large InnoDB tables?

I used the solution they provided and it worked: Make a new table and insert only the rows you want to keep, then drop the original. BUT! The DBA ran the same straight DELETE on a copy of the database and it worked in 5 minutes. Even with caching I'm surprised.

I can't find any documentation from Oracle or otherwise proving the physical reordering is happening. Does anyone know where I can find documentation stating this is the case? Or some way to prove if this is the cause?


Solution

  • How did you find the rows to delete? Is there an index? If you need to scan the whole table, that takes a lot for 200M rows.

    (Assuming InnoDB, ...)

    For each row to delete, it does these:

    • Queue up changes to the indexes.
    • Generate undo information, that will need to be cleaned out later.
    • Remove the record from the block where it lives.
    • If a block becomes 'small', coalesce with an adjacent block.

    Other comments:

    • If the table is bigger than the buffer_pool, you are likely to incur a lot of I/O.
    • If the rows are randomly arranged, there could be I/O to get to the desired rows.
    • If the key is a UUID, you are very likely to need I/O.
    • If the amount of undo stuff is 'too big', the process slows down be cause of having to work harder to save it.

    Another aspect -- if there are SELECTs against the table at the same time, there is some interaction that slows things down.

    No, the table is not rebuilt, only little parts may be rearranged.

    A possible improvement is to walk through the rows to delete, doing no more than 1000 at a time. COMMIT each change before moving on to the next chunk. More details, and more suggestions.