Tags: python, pandas, dataframe, jaro-winkler

Applying Jaro-Winkler distance to dataframe


I have a DataFrame with two columns: the first holds the correct strings, the second holds corrupted versions of them. I want to compute the Jaro-Winkler distance for each pair and store it in a new third column.

import pandas as pd
from pyjarowinkler.distance import get_jaro_distance

df = pd.DataFrame(
    {"Correct": ["Hello", "bread", "situation"],
     "Corrupt": ["Hlloe", "braed", "sitatuion"]},
    index=[1, 2, 3])

Solution

  • df['res'] = [get_jaro_distance(x, y) for x, y in zip(df['Correct'], df['Corrupt'])]
    
         Correct    Corrupt   res
    1      Hello      Hlloe  0.88
    2      bread      braed  0.95
    3  situation  sitatuion  0.97
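
To make the numbers above easier to check by hand, here is a minimal pure-Python sketch of the textbook Jaro-Winkler similarity. It is not pyjarowinkler's implementation: the function names `jaro_similarity` and `jaro_winkler`, the scaling factor of 0.1, and the 4-character prefix cap are assumptions made for illustration. Note that the score behaves as a similarity, so higher values mean closer strings.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order, counted in pairs.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted by the shared prefix (at most 4 characters)."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * scaling * (1 - jaro)


# Same data as in the question, kept as plain lists so the sketch has no dependencies.
correct = ["Hello", "bread", "situation"]
corrupt = ["Hlloe", "braed", "sitatuion"]

# Same row-wise zip pattern as in the solution above.
scores = [round(jaro_winkler(x, y), 2) for x, y in zip(correct, corrupt)]
print(scores)  # [0.88, 0.95, 0.97]
```

With a DataFrame, the same comprehension over zip(df['Correct'], df['Corrupt']) can be assigned to df['res'], exactly as in the solution.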