Search code examples
sqlsql-serverduplicatessql-updatesql-delete

Merge duplicate rows to one row from 3 tables


I have found merge rows threads but nothing as complex as this. I have these 2 tables below. As you can see I have duplicates in my first table. I need to keep only one of the rows. But before I can delete any rows I need to migrate all the license plate numbers in table 2 to only first_table_id.

id VIN
1 1234
2 1234
3 1234
4 1234
id License_Plate_Number first_table_id
101 1111 1
102 2222 2
103 3333 3
104 4444 4

After joining them on the id from the first table it would look like this

id VIN License_Plate_Number
1 1234 1111
2 1234 2222
3 1234 3333
4 1234 4444

The expected result would be this

id VIN
1 1234
id License_Plate_Number first_table_id
101 1111 1
102 2222 1
103 3333 1
104 4444 1
id VIN License_Plate_Number
1 1234 1111
1 1234 2222
1 1234 3333
1 1234 4444

Is this even possible to do?


Solution

  • As a starter: here is a small query that gives you the mapping between the "old" and the target id.

    select id, min(id) over(partition by vin) new_id from table1
    

    With this information at hand, we can re-assign the rows of the second table to the target id with the update/join syntax:

    update t2
    set first_table_id = t1.new_id 
    from table2 t2
    inner join (select id, min(id) over(partition by vin) new_id from table1) t1 
        on t1.id = t2.first_table_id
    

    We can then delete the "duplicates" in the first table:

    with cte as (select id, min(id) over(partition by vin) new_id from table1)
    delete from cte where id != new_id 
    

    Beware that the two queries need to be executed in the same transaction to avoid potential concurrency (ideally, other processes writing to the tables should be put on hold in the meantime).