I have found merge rows threads but nothing as complex as this. I have these 2 tables below. As you can see I have duplicates in my first table. I need to keep only one of the rows. But before I can delete any rows I need to migrate all the license plate numbers in table 2 to only first_table_id.
id | VIN |
---|---|
1 | 1234 |
2 | 1234 |
3 | 1234 |
4 | 1234 |
id | License_Plate_Number | first_table_id |
---|---|---|
101 | 1111 | 1 |
102 | 2222 | 2 |
103 | 3333 | 3 |
104 | 4444 | 4 |
After joining them on the id from the first table it would look like this
id | VIN | License_Plate_Number |
---|---|---|
1 | 1234 | 1111 |
2 | 1234 | 2222 |
3 | 1234 | 3333 |
4 | 1234 | 4444 |
The expected result would be this
id | VIN |
---|---|
1 | 1234 |
id | License_Plate_Number | first_table_id |
---|---|---|
101 | 1111 | 1 |
102 | 2222 | 1 |
103 | 3333 | 1 |
104 | 4444 | 1 |
id | VIN | License_Plate_Number |
---|---|---|
1 | 1234 | 1111 |
1 | 1234 | 2222 |
1 | 1234 | 3333 |
1 | 1234 | 4444 |
Is this even possible to do?
As a starter: here is a small query that gives you the mapping between the "old" and the target id
.
select id, min(id) over(partition by vin) new_id from table1
With this information at hand, we can re-assign the rows of the second table to the target id with the update
/join
syntax:
update t2
set first_table_id = t1.new_id
from table2 t2
inner join (select id, min(id) over(partition by vin) new_id from table1) t1
on t1.id = t2.first_table_id
We can then delete the "duplicates" in the first table:
with cte as (select id, min(id) over(partition by vin) new_id from table1)
delete from cte where id != new_id
Beware that the two queries need to be executed in the same transaction to avoid potential concurrency (ideally, other processes writing to the tables should be put on hold in the meantime).