Search code examples
phpmysqlmergeforeign-keysredundancy

How can I merge two redundant records in a MySQL table, maintaining all PK/FK relationships?


Say I have a table customers with the following fields and records:

id   first_name   last_name   email                  phone
------------------------------------------------------------------------
1    Michael      Turley      mturley@whatever.com   555-123-4567
2    John         Dohe        jdoe@whatever.com      
3    Jack         Smith       jsmith@whatever.com    555-555-5555
4    Johnathan    Doe                                123-456-7890

There are several other tables, such as orders, rewards, receipts which have foreign keys customer_id relating to this table's customers.id.

As you can see, in their infinite wisdom, my users have created duplicate records for John Doe, complete with inconsistent spelling and missing data. An administrator notices this, selects customers 2 and 4, and clicks "Merge". They are then prompted to select which value is correct for each field, etc etc and my PHP determines that the merged record should look like this:

id   first_name   last_name   email                  phone
------------------------------------------------------------------------
?    John         Doe         jdoe@whatever.com      123-456-7890

Let's assume Mr. Doe has placed several orders, earned rewards, generated receipts.. but some of these have been associated with id 2, and some have been associated with id 4. The merged row needs to match all of the foreign keys in other tables that matched the original rows.

Here's where I'm not sure what to do. My instinct is to do this:

DELETE FROM customers WHERE id = 4;

UPDATE customers
SET first_name = 'John',
    last_name  = 'Doe',
    email      = 'jdoe@whatever.com',
    phone      = '123-456-7890'
WHERE id = 2;

UPDATE orders, rewards, receipts
SET customer_id = 2
WHERE customer_id = 4;

I think that would work, but if later on I add another table that has a customer_id foreign key, I have to remember to go back and add that table to the second UPDATE query in my merge function, or risk loss of integrity.

There has to be a better way to do this.


Solution

  • As an update to my comment:

    use information_schema;
    select table_name from columns where column_name = 'customer_id';
    

    Then loop through the resulting tables and update accordingly.

    Personally, I would use your instinctive solution, as this one may be dangerous if there are tables containing customer_id columns that need to be exempt.