python dictionary fuzzywuzzy

Getting indexes from fuzzy matching results stored in a dictionary using process.extract


Using the code below, I was able to get the fuzzy-matched results from a dictionary, find_desc_dict, and store them in another dictionary called complete_dict.

from fuzzywuzzy import fuzz, process

for i, _ in enumerate(recognized_keywords_search_desc):
    complete_dict[i+1] = process.extract(search_desc_list[i], find_desc_dict, limit=10, scorer=fuzz.token_sort_ratio)

Here is an example to clarify what the key-value pairs in complete_dict look like:

{1: [('some string', 72, 19), ('some other string', 72, 20), ('some string', 72, 19), ('some other string', 72, 20), ('some string', 72, 19), ('some other string', 72, 20), ('some string', 72, 19), ('some other string', 72, 20), ('some string', 72, 19), ('some other string', 72, 20)], 2: [('some string', 89, 205), ('some other string', 71, 92), ('some string', 89, 205), ('some other string', 71, 92), ('some string', 89, 205), ('some other string', 71, 92), ('some string', 89, 205), ('some other string', 71, 92), ('some string', 89, 205), ('some other string', 71, 92)]}

In other words, the structure of complete_dict is {key: [(string, ratio, index), (string, ratio, index), ..., (string, ratio, index)], key: [(string, ratio, index), (string, ratio, index), ..., (string, ratio, index)]}. I would like to learn how to get just the indexes stored inside complete_dict.


Solution

  • For my purposes, the code below works and returns a single flat list of all the indexes. That raises the question of which indexes belong to which key. Assuming every key holds exactly 10 matches (limit=10), the first 10 elements correspond to key 1, the next 10 to key 2, and so on. It is also possible to convert the list into a dictionary afterwards, or to store the indexes in a dictionary to begin with.

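    indexes2workwith = []
    # Each match is a (string, ratio, index) tuple; keep only the index.
    # Iterating instead of calling pop(0) leaves complete_dict unchanged.
    for matches in complete_dict.values():
        for _string, _ratio, index in matches:
            indexes2workwith.append(index)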
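    The answer mentions that the indexes could be stored in a dictionary from the start. A minimal sketch of that idea follows, assuming complete_dict has the {key: [(string, ratio, index), ...]} shape described in the question. The sample data and the names indexes_by_key and all_indexes are made up for illustration and do not come from the original post.

```python
# Hypothetical sample data shaped like complete_dict in the question:
# {key: [(string, ratio, index), ...]}
complete_dict = {
    1: [("some string", 72, 19), ("some other string", 72, 20)],
    2: [("some string", 89, 205), ("some other string", 71, 92)],
}

# Keep only the third element of each tuple, grouped under its original key.
indexes_by_key = {
    key: [index for _string, _ratio, index in matches]
    for key, matches in complete_dict.items()
}

# Flatten into a single list when the grouping is not needed.
all_indexes = [index for indexes in indexes_by_key.values() for index in indexes]

print(indexes_by_key)  # {1: [19, 20], 2: [205, 92]}
print(all_indexes)     # [19, 20, 205, 92]
```

    Grouping by key avoids having to rely on every key holding exactly 10 matches, and complete_dict itself is not modified.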