python-3.x string comparison

Comparing names in different formats using Python


I want to compare names that appear in different formats, e.g. "George W. Bush", "George Bush", "George Walker Bush", "Bush, George Walker", "Bush, GW", "Bush, George", etc. A few also contain dots ("."), but I omitted those from the list because I will normalize them anyway. In fact, the commas (",") will be stripped as well.

What is the best, most efficient approach to determine whether any two given names actually represent the same person? I have thought of using nameparser and building a comparison algorithm on top of it, but please suggest any other options. An approach using only Python's standard modules would be fine too.


Solution

  • There's an open-source library, whoswho, that may be useful, or that can at least serve as a base for building more functionality.

    https://github.com/rliebz/whoswho

    Sample usage:

    >>> from whoswho import who
    >>> who.match('Bush, G.W.', 'George W. Bush')
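
Since the question also allows standard-library approaches, here is a minimal sketch of one, not a definitive implementation. The helper names (parse_name, tokens_compatible, same_person) are hypothetical and are not part of whoswho or nameparser. It assumes the comma is still present when the name is parsed, because the comma is what identifies the "Last, First" form; if commas are stripped beforehand, "Bush GW" becomes ambiguous. It also assumes that a short all-caps token such as "GW" is a run of initials, which is only a heuristic.

```python
def parse_name(raw):
    """Return (given_tokens, last_name), all lowercase."""
    cleaned = raw.replace(".", " ").strip()
    if "," in cleaned:
        # "Last, First Middle" form: the surname precedes the comma.
        last, _, rest = cleaned.partition(",")
        tokens = rest.split()
    else:
        # "First Middle Last" form: assume the final token is the surname.
        tokens = cleaned.split()
        last = tokens.pop() if tokens else ""
    given = []
    for tok in tokens:
        if tok.isupper() and len(tok) <= 3:
            # Heuristic (assumption): "GW" or "G" is a run of initials.
            given.extend(tok.lower())
        else:
            given.append(tok.lower())
    return given, last.strip().lower()


def tokens_compatible(a, b):
    """An initial matches any name starting with it; full names must be equal."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b


def same_person(name_a, name_b):
    """Guess whether two differently formatted names refer to one person."""
    given_a, last_a = parse_name(name_a)
    given_b, last_b = parse_name(name_b)
    if not last_a or last_a != last_b:
        return False
    # Compare given names position by position; zip() stops at the shorter
    # list, so a missing middle name does not prevent a match.
    return all(tokens_compatible(a, b) for a, b in zip(given_a, given_b))


if __name__ == "__main__":
    print(same_person("George W. Bush", "Bush, George Walker"))
    print(same_person("Bush, GW", "George Bush"))
    print(same_person("George Bush", "Jeb Bush"))
```

The comparison is deliberately permissive: it treats a missing middle name as compatible and does not handle nicknames, suffixes such as "Jr.", or multi-word surnames. Those cases are where a library such as whoswho or nameparser is likely to do better.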