Search code examples
mysqlsqljoinwindow-functionssql-delete

Delete duplicates with condition


I have the table contacts which contains duplicate records:

id name is_contacted created_at

I need to delete duplicates, but keep the first record(among the duplicates for each name) where is_contacted=1. If among the record duplicates there are no records where is_contacted=1, just keep the first one.

This is what I have so far:

DELETE c1 FROM contacts c1
INNER JOIN contacts c2 
WHERE
    c1.id > c2.id AND 
    c1.name = c2.name;

Solution

  • Assuming that is_contacted's data type is BOOLEAN and id is the primary key of the table and this is the column that defines the order and which row should be considered first, use ROW_NUMBER window function to rank the rows of each name:

    WITH cte AS (
      SELECT *, ROW_NUMBER() OVER (PARTITION BY name ORDER BY is_contacted DESC, id) rn
      FROM contacts
    )
    DELETE t
    FROM contacts t INNER JOIN cte c
    ON c.id = t.id
    WHERE c.rn > 1;
    

    ORDER BY is_contacted DESC, id returns the rows with is_contacted = 1 at the top (if they exist).

    For versions of MySql prior to 8.0, without support of CTEs and winow functions, use a join of the table to a query that uses aggregation to get the id of the row that you want to keep:

    DELETE t
    FROM contacts t 
    INNER JOIN ( 
      SELECT name,
             COALESCE(MIN(CASE WHEN is_contacted THEN id END), MIN(id)) id         
      FROM contacts
      GROUP BY name
    ) c ON c.name = t.name AND c.id <> t.id;