I need to get the values of another column back from row indices. My dataset is as follows:
import pandas as pd
try_test = pd.DataFrame({'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
                         'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken']})
     word      name
0   apple       dog
1  orange       cat
2    diet   mad cat
3  energy  good dog
4    fire   bad dog
5    cake   chicken
Using this function:
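from fuzzywuzzy import fuzz  # assumed source of fuzz; thefuzz and rapidfuzz also provide fuzz.partial_ratio

def func(name):
    # True for every row whose 'name' is at least 85% similar to the given name
    matches = try_test.apply(lambda row: (fuzz.partial_ratio(row['name'], name) >= 85), axis=1)
    return [i for i, x in enumerate(matches) if x]

try_test.apply(lambda row: func(row['name']), axis=1)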
I got the following values:
0 [0, 3, 4]
1 [1, 2]
2 [1, 2]
3 [0, 3]
4 [0, 4]
5 [5]
I would like to have the word fields instead of the indices.
Expected output:
0 [apple, energy, fire]
1 [orange, diet]
2 [orange, diet]
3 [apple, energy]
4 [apple, fire]
5 [cake]
Any suggestions would be greatly appreciated.
In the list comprehension of your function, return try_test.word[i] instead of i:

def func(name):
    matches = try_test.apply(lambda row: (fuzz.partial_ratio(row['name'], name) >= 85), axis=1)
    return [try_test.word[i] for i, x in enumerate(matches) if x]
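One caveat: try_test.word[i] looks rows up by index label, while enumerate yields positions, so the two only agree when the frame has the default 0..n-1 index. A minimal sketch that sidesteps this by selecting words with a boolean mask is below. The names matching_words and substring_score are made up for illustration, and substring_score is only a crude stand-in for fuzz.partial_ratio so the example runs without a fuzzy-matching library; it is not the same metric.

```python
import pandas as pd

try_test = pd.DataFrame({
    'word': ['apple', 'orange', 'diet', 'energy', 'fire', 'cake'],
    'name': ['dog', 'cat', 'mad cat', 'good dog', 'bad dog', 'chicken'],
})

def substring_score(a, b):
    # Hypothetical stand-in scorer (NOT fuzz.partial_ratio):
    # 100 if the shorter string occurs inside the longer one, else 0.
    shorter, longer = sorted((a, b), key=len)
    return 100 if shorter in longer else 0

def matching_words(df, name, scorer=substring_score, threshold=85):
    # Boolean mask aligned with df's index, so the result does not
    # depend on the index being 0..n-1.
    mask = df['name'].map(lambda other: scorer(other, name) >= threshold)
    return df.loc[mask, 'word'].tolist()

result = try_test['name'].map(lambda name: matching_words(try_test, name))
print(result)
```

With a fuzzy-matching library installed, you could pass scorer=fuzz.partial_ratio to matching_words instead of relying on the stand-in.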