java algorithm synchronization addressbook

Algorithm to compare people's names to detect similarity


I am working on an address book synchronization algorithm. I would like to reuse existing code if there is any, but I haven't found anything yet.

Does anyone know of an algorithm that tells me, as a number (a float or a percentage), how similar two names are? Levenshtein distance is not a good fit here, because names in our address books should match on the beginning of each part of the name.

For example, John Smith should match:
Smith Jon, Jonathan Smith, Johnny Smith


Solution

  • Have a look at the Jaro-Winkler algorithm too. It works well for short strings such as names and gives extra weight to a shared prefix: http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance

    If first-name/last-name order is an issue, you could sort the parts of each name (for example alphabetically) before comparing, so that Smith John is stored as John Smith.
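
To make the two suggestions above concrete, here is a minimal Java sketch rather than a definitive implementation. The class name `NameSimilarity` and its method names are made up for illustration. It assumes that sorting the lower-cased name parts alphabetically is an acceptable way to neutralize first-name/last-name order, and it uses the conventional Winkler settings (prefix scale 0.1, at most 4 prefix characters). In a real project a library implementation, such as `JaroWinklerSimilarity` in Apache Commons Text, may be preferable to hand-rolled code.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper class, not taken from the original answer.
final class NameSimilarity {

    // Lower-case the name, split it on whitespace, sort the parts and re-join
    // them, so "Smith John" and "John Smith" normalize to the same string.
    static String normalize(String name) {
        String[] parts = name.trim().toLowerCase(Locale.ROOT).split("\\s+");
        Arrays.sort(parts);
        return String.join(" ", parts);
    }

    // Jaro similarity in [0, 1]; 1.0 means the strings are identical.
    static double jaro(String a, String b) {
        if (a.equals(b)) return 1.0;
        if (a.isEmpty() || b.isEmpty()) return 0.0;

        // Characters count as matching only if they are at most 'window' apart.
        int window = Math.max(0, Math.max(a.length(), b.length()) / 2 - 1);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;

        // Walk the matched characters of both strings in order; each position
        // where they differ is half a transposition.
        int k = 0;
        int halfTranspositions = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) continue;
            while (!bMatched[k]) k++;
            if (a.charAt(i) != b.charAt(k)) halfTranspositions++;
            k++;
        }
        double m = matches;
        return (m / a.length() + m / b.length()
                + (m - halfTranspositions / 2.0) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
    static double jaroWinkler(String a, String b) {
        double j = jaro(a, b);
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }

    // Order-insensitive similarity of two full names, in [0, 1].
    static double similarity(String name1, String name2) {
        return jaroWinkler(normalize(name1), normalize(name2));
    }
}
```

Calling `NameSimilarity.similarity("John Smith", "Smith Jon")` returns a score between 0 and 1 that can be multiplied by 100 for a percentage. Because the parts are sorted before the comparison, the prefix bonus applies only to the part that sorts first; scoring each part separately and averaging the results would be an alternative if every part should get the bonus. Where to set the "same person" threshold is something to tune against real address book data.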