Search code examples
sqlperformancejoinsql-deleteself-join

Slow self-join delete query


Does it get any simpler than this query?

delete a.* from matches a
    inner join matches b ON (a.uid = b.matcheduid)

Yes, apparently it does... because the performance on the above query is really bad when the matches table is very large.

matches is about 220 million records. I am hoping that this DELETE query takes the size down to about 15,000 records. How can I improve the performance of the query? I have indexes on both columns. UID and MatchedUID are the only two columns in this InnoDB table, both are of type INT(10) unsigned. The query has been running for over 14 hours on my laptop (i7 processor).


Solution

  • Deleting so many records can take a while, I think this is as fast as it can get if you're doing it this way. If you don't want to invest into faster hardware, I suggest another approach:

    If you really want to delete 220 million records, so that the table then has only 15.000 records left, thats about 99,999% of all entries. Why not

    1. Create a new table,
    2. just insert all the records you want to survive,
    3. and replace your old one with the new one?

    Something like this might work a little bit faster:

    /* creating the new table */
    CREATE TABLE matches_new
    SELECT a.* FROM matches a
    LEFT JOIN matches b ON (a.uid = b.matcheduid)
    WHERE ISNULL (b.matcheduid)
    
    /* renaming tables */
    RENAME TABLE matches TO matches_old;
    RENAME TABLE matches_new TO matches;
    

    After this you just have to check and create your desired indexes, which should be rather fast if only dealing with 15.000 records.