I am working on an address book synchronization algorithm. I would like to reuse existing code if there is any, but I haven't found any yet.
Does anyone know of an algorithm that will tell me, as a number (a float or a percentage), how similar two names are? Levenshtein distance is not a good fit here, because the names in our address books should be matched on the beginning of each part of the name.
For example, "John Smith" should match "Smith Jon", "Jonathan Smith", and "Johnny Smith".
Have a look at the Jaro-Winkler algorithm too. It works well for names because it gives extra weight to strings that share a common prefix. http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
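To give an idea of what that looks like in code, here is a minimal Python sketch of Jaro-Winkler similarity. The function names `jaro` and `jaro_winkler` and the parameters `prefix_scale` and `max_prefix` are my own choices for illustration; the defaults (0.1 and 4) are the values usually quoted for the algorithm. It returns a float between 0.0 (nothing in common) and 1.0 (identical), and it compares characters exactly, so you would probably want to lowercase both names first.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0.0, 1.0]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Walk the matched characters in order and count those out of sequence.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


if __name__ == "__main__":
    print(jaro_winkler("john smith", "jon smith"))
    print(jaro_winkler("john smith", "johnny smith"))
```

As a sanity check, the usual textbook pair MARTHA / MARHTA comes out at about 0.961 with this sketch. Where to put the "same person" threshold is something you would have to tune on your own address book data.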
If the problem is first name / last name ordering, you could just sort the name parts before comparing, so that Smith John is stored as John Smith.
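A rough Python sketch of that sorting idea, assuming names are split on whitespace and commas and compared case-insensitively (the helper name `normalize_name` is made up for this example):

```python
def normalize_name(name: str) -> str:
    """Lowercase a name, drop commas, and sort its parts alphabetically."""
    tokens = name.replace(",", " ").lower().split()
    return " ".join(sorted(tokens))


if __name__ == "__main__":
    # Both orderings normalize to the same string.
    print(normalize_name("Smith John"))   # john smith
    print(normalize_name("John Smith"))   # john smith
```

After normalizing, the order of the parts no longer matters, so whatever similarity measure you pick only has to deal with the spelling differences (Jon vs. John vs. Johnny).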