Search code examples
mysqllimitsql-delete

How to set correct limit for mysql delete statement


I have a script that checks for duplicate pairs in a database and selects all entries that need to be deleted except for one.

I have this script that selects the first 100 entries that need to be deleted and works fine:

SELECT*  
  FROM vl_posts_testing 
  INNER JOIN (
       SELECT max(ID) AS lastId, `post_content`,`post_title`
       FROM vl_posts_testing WHERE vl_posts_testing.post_type='post'
       GROUP BY `post_content`,`post_title`
       HAVING count(*) > 1) duplic 
  ON duplic.`post_content` = vl_posts_testing.`post_content` 
  AND duplic.`post_title` = vl_posts_testing.`post_title`
WHERE vl_posts_testing.id < duplic.lastId 
AND vl_posts_testing.post_type='post' 
LIMIT 0,100

However when I try to delete this set of data using:

DELETE vl_posts_testing 
  FROM vl_posts_testing 
  INNER JOIN (
       SELECT max(ID) AS lastId, `post_content`,`post_title`
       FROM vl_posts_testing WHERE vl_posts_testing.post_type='post'
       GROUP BY `post_content`,`post_title`
       HAVING count(*) > 1) duplic 
     ON duplic.`post_content` = vl_posts_testing.`post_content` 
     AND duplic.`post_title` = vl_posts_testing.`post_title`
WHERE vl_posts_testing.id < duplic.lastId 
AND vl_posts_testing.post_type='post' 
LIMIT 100

I receive the fallowing error:

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'LIMIT 10' at line 8

The script has been constructed using this answer https://stackoverflow.com/a/6108860/1168944

Actually the script works just fine on a small amount of data without the limits set, however due to the fact that I run it on a big table (some 600k entries) I need to split this script in a routine that processes only a limited amount of data due to server limits like processor, memory etc.

Also took into consideration this example: MySQL LIMIT on DELETE statement but the result is different since no modification is executed no matter how small is the limit.


Solution

  • After several retries I have found a way to make it work:

    DELETE vl_posts_testing 
      FROM vl_posts_testing 
      INNER JOIN (
           SELECT max(ID) AS lastId, `post_content`,`post_title`
           FROM vl_posts_testing WHERE vl_posts_testing.post_type='post'
           GROUP BY `post_content`,`post_title`
           HAVING count(*) > 1
           LIMIT 0,100 ) duplic 
         ON duplic.`post_content` = vl_posts_testing.`post_content` 
         AND duplic.`post_title` = vl_posts_testing.`post_title`
    WHERE vl_posts_testing.id < duplic.lastId 
    AND vl_posts_testing.post_type='post' 
    

    Actually what I did is set an inner limit to the first set of data and compare it to the rest of the database in order to make it work. It work but I am not sure this is the correct way ot do it.