Search code examples
mysqlsqlduplicatessql-delete

Remove row if column contains duplicate data SQL


My goal is to delete/remove any rows from the ma_images table which contains a duplicate of the same piece of data in anther row, but to retain the original/first instance of that data which has been duplicated. To put it another way, if row #1 mentions 'image_1.jpg', and then rows 3, 5, & 6 also mention the dame data imageNameMaster column, i want to delete instances 3, 5, & 6 but keep the first instance (#1)'

I'm not good with sql and have been working on this for a day now, googled, read stack, researched, haven't found an example that i can understand/relate to my problem to effect a solution.

Below is an example of the table i'm working with minus some of the extra columns in the table. Below that is the lastest bit of sql i've tried so far and the error message which that attempt produced.

example of table minus a lot of columns

ERROR: You can't specify target table 'img' for update in FROM clause

delete img
from  ma_images AS img
where exists (
    select 1
    from ma_images ref
    where ref.ImgNameMaster = img.ImgNameMaster

        and ref.ImgID < img.ImgID
)

Solution

  • MySQL is finicky about using the table being updated/deleted in a subquery.

    The best solution is to use join:

    delete img
        from ma_images img JOIN
             ma_images ref
             on ref.ImgNameMaster = img.ImgNameMaster and    
                ref.ImgID < img.ImgID;
    

    This version might attempt to delete the same row multiple times. So, I would suggest:

    delete img
        from ma_images img JOIN
             (select ref.ImgNameMaster, MIN(ImgId) as min_ImgId
              from ma_images ref
              group by ref.ImgNameMaster
             ) ref
             on ref.ImgNameMaster = img.ImgNameMaster and    
                img.ImgID > ref.min_ImgID;