python, loops, nlp, similarity, sentence-similarity

If condition to match two strings within two 'for loops'


Please check my code below. I am trying to iterate over two dataframes and check whether the country name is the same in both. However, I keep getting an NA/NaN values error and I cannot understand why. Neither dataset contains any NA/NaN values, yet the error persists. Please help! The error is thrown at the if statement. country_name is a string such as United States, India, etc.

for reviewer_id, row in data.iterrows():
    for reviewer_id, row1 in data1.iterrows():
        if data1['country_name'][row1] == data['country_name'][row]:
            similar=textdistance.Levenshtein(row.Fname_Username,row1.Fname_Username)
            data2['key1']= str(data['reviewer_id'])+'_'+str(data1['reviewer_id'])
            data2['Fname_Username']= str(data['Fname_Username'])+'_'+str(data1['Fname_Username'])
            data2['Similarity1']=similar

ValueError: cannot index with vector containing NA / NaN values


Solution

  • Take a careful look at how iterrows() works (for example here). row and row1 are already the rows you want to access, so there is no need to index back into the dataframes with them; you just have to get the column within each row, e.g.

    if row1['country_name'] == row['country_name']:
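
For completeness, here is a minimal sketch of how the whole loop might look with that fix applied. The sample dataframes are made up for illustration, and the small `levenshtein` helper is a hand-written stand-in so the snippet runs without extra packages (with textdistance installed, I believe `textdistance.levenshtein(a, b)` would serve the same purpose). It also collects the matches in a list and builds `data2` once at the end, which is an assumption about what the original code was meant to do rather than something stated in the question.

```python
import pandas as pd


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance; a stand-in for a library call."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical sample data, only to make the sketch runnable.
data = pd.DataFrame({
    "reviewer_id": [1, 2],
    "country_name": ["India", "United States"],
    "Fname_Username": ["anna", "bob"],
})
data1 = pd.DataFrame({
    "reviewer_id": [10, 11],
    "country_name": ["India", "India"],
    "Fname_Username": ["anne", "rob"],
})

matches = []
for _, row in data.iterrows():
    for _, row1 in data1.iterrows():
        # row and row1 are Series, so read the column from each directly.
        if row1["country_name"] == row["country_name"]:
            matches.append({
                "key1": f"{row['reviewer_id']}_{row1['reviewer_id']}",
                "Fname_Username": f"{row['Fname_Username']}_{row1['Fname_Username']}",
                "Similarity1": levenshtein(row["Fname_Username"],
                                           row1["Fname_Username"]),
            })

# One output row per matching pair.
data2 = pd.DataFrame(matches)
print(data2)
```

Note that the original loop assigned whole columns (`data2['key1'] = ...`) on every iteration, so each match would overwrite the previous one; appending to a list avoids that. For large dataframes, merging the two frames on country_name first would likely be faster than nested iterrows() loops.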