Search code examples
searchexcel-formulamatchexcel-2019

How do I return an value from table 1, from a row which has all the names from a row in table 2, irrespective of what column those names appear in?


I have two tables, both have the columns

First Name|Preferred First Name|Middle Name|Last Name|Preferred Last Name

I need to match records from table 1 to table 2. And return an ID number from table 1 to display in table 2. Table 1 is from an database, table 2 contains a lot more information about these people which needs to be matched up with their ID number in table 1.

How do I return an ID number from table 1, from a row which has all the names from a row in table 2, irrespective of what column those names appear in?

enter image description here

The problem is that the names are all jumbled up. Some people have put in their first and middle name into the first name field. Some people called James have recorded their first name as Jim in table 2, which will match their preferred name, but not their first name in table 1. Some have done the same with their surname - e.g. Mr Bruggen-Pullman has recorded his surname as Bruggen because that's what he uses. It might be recorded in his preferred last name field, but that's no good if it's being checked against a different field. Some people have just got totally confused and put their last name as their first name, and so on.

I tried this approach just changing round row and column, but it didn't return anything useful.

Using a helper column is fine if it... helps.

Bonus points: Some records in table 2 may be correct, but not include all the names. So Collins [empty string in middle name field] Jado in table 2 should match the record Jado Sally Collins in table 1, but not the other way around. That is, if it says Jonas in table 2, the formula in table 2 must find a record that contains Jonas and all the other names.

Extra points if you can work out how to filter all spaces and punctuation from the name. E.g. Micheal John O'Leary-Williams should become MichaelOLearyWilliams. I don't have the 365 functions to play with like FILTER so that may not be possible.

Thanks all.


Solution

  • If you have too many records to go through manually, you might need to perform some sort of fuzzy lookup. Power Query has this capability. There's also an Excel add-in you can install for free:

    https://www.microsoft.com/en-us/download/details.aspx?id=15011

    In either case, the idea is to match rows that are similar using a similarity threshold.

    Also, I don't know if you had any control over the data collection in this situation, but I recommend using data validation whenever possible to mitigate these sorts of problems.

    Here's the output I got using the add-in. It successfully matched all 3 records.

    enter image description here