Search code examples
sqlpostgresqlduplicatessql-delete

Remove duplicates for the same product id only


Table holds the images of the products. Each row points to URL of the product's image. Many products have multiple images.

url > Image url
product_id > Product's ID

Some products have duplicate images. I need to leave only one from the duplicates and remove other duplicate URLs for that product.

I cannot group URLs and remove duplicates because there might be another row that has the same URL and have different product_id.

TABLE
-
id | product_id | url | is_primary

enter image description here


Solution

  • You can use EXISTS to delete the duplicates:

    delete from tablename t
    where exists (
      select 1 from tablename
      where product_id = t.product_id and url = t.url and id < t.id
    )
    

    This will only one of the duplicates urls for each product_id, the one with the smallest id.