
Setting a Threshold for fuzzywuzzy process.extractOne


I'm matching product names between two different retailers by string similarity, using the fuzzywuzzy process.extractOne function to find the best match.

However, I want to set a score threshold so that a product only matches when its score is above that threshold. Currently every product gets matched to its closest string, no matter how poor the match is.

The following code gives me the best match (although I am currently getting errors):

title, index, score = process.extractOne(text, choices_dict)

I then tried the following code to set a threshold:

title, index, score = process.extractOne(text, choices_dict, score_cutoff=80)

Which results in the following TypeError:

TypeError: cannot unpack non-iterable NoneType object

Finally, I also tried the following code:

title, index, scorer, score = process.extractOne(text, choices_dict, scorer=fuzz.token_sort_ratio, score_cutoff=80)

Which results in the following error:

ValueError: not enough values to unpack (expected 4, got 3)


Solution

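  • process.extractOne returns None when the best score is below score_cutoff, and unpacking None is what raises the TypeError. The ValueError in the last attempt has a different cause: scorer is only an argument and is not part of the result, so with a dict of choices the result is always a 3-tuple (value, score, key). To handle the None case, either check for None or catch the exception: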

    best_match = process.extractOne(text, choices_dict, score_cutoff=80)
    if best_match:
        value, score, key = best_match
        print(f"best match is {key}:{value} with the similarity {score}")
    else:
        print("no match found")
    

    or

    try:
        value, score, key = process.extractOne(text, choices_dict, score_cutoff=80)
        print(f"best match is {key}:{value} with the similarity {score}")
    except TypeError:
        print("no match found")
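
As a rough sketch of how the None check above could be wrapped when matching many products at once, the snippet below converts the raw result into a named tuple, so the (value, score, key) order only has to be remembered in one place. The names Match, to_match and match_products are hypothetical helpers for illustration, not part of fuzzywuzzy, and the sketch assumes choices_dict is a dict, since that is when the key is returned as the third element.

```python
from typing import Any, Dict, Iterable, NamedTuple, Optional


class Match(NamedTuple):
    """Hypothetical container for one match; not a fuzzywuzzy type."""
    key: Any      # key of the matched entry in the choices dict
    value: str    # the matched string
    score: int    # similarity score on a 0-100 scale


def to_match(result: Optional[tuple]) -> Optional[Match]:
    """Convert a raw extractOne result (or None) into a Match."""
    if result is None:          # best score was below score_cutoff
        return None
    value, score, key = result  # dict choices yield (value, score, key)
    return Match(key=key, value=value, score=score)


def match_products(
    products: Iterable[str], choices_dict: Dict[Any, str], cutoff: int = 80
) -> Dict[str, Optional[Match]]:
    """Map each product title to its Match, or to None if nothing reaches the cutoff."""
    # Imported here so that to_match stays usable without the library installed.
    from fuzzywuzzy import fuzz, process

    return {
        title: to_match(
            process.extractOne(
                title,
                choices_dict,
                scorer=fuzz.token_sort_ratio,  # an input only; never returned
                score_cutoff=cutoff,
            )
        )
        for title in products
    }
```

Calling match_products(titles, choices_dict, cutoff=80) would then give a dict in which unmatched titles map to None, so they can be filtered out or reviewed separately instead of raising during unpacking.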