I'm testing fuzzywuzzy's process.extractBests() as follows:
from fuzzywuzzy import process
# Define the query string
query = "Apple"
# Define the list of choices
choices = ["Apple", "Apple Inc.", "Apple Computer", "Apple Records", "Apple TV"]
# Call the process.extractBests function
results = process.extractBests(query, choices)
# Print the results
for result in results:
    print(result)
It outputs:
('Apple', 100)
('Apple Inc.', 90)
('Apple Computer', 90)
('Apple Records', 90)
('Apple TV', 90)
Why didn't the scorer give 100 to all strings since they all 100% contain the query string ("Apple")?
I use fuzzywuzzy==0.18.0 with Python 3.11.7.
fuzzywuzzy's extractBests() does not give 100 to every choice because it measures similarity, not containment. Its default scorer, fuzz.WRatio, first compares the two strings as wholes, and "Apple Inc." is not an exact match for your query, "Apple". When one string is at least 1.5 times as long as the other, WRatio also tries a partial (substring) match, but it multiplies that result by 0.9. The substring match here is perfect (100), so the scaled score is 90. Only the "Apple" choice scores 100, because it matches the query exactly. I hope this helps!
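To make the 90 concrete, here is a rough, standard-library-only sketch of the length-aware weighting that the default scorer (fuzz.WRatio) applies. It is a simplified illustration, not fuzzywuzzy's actual code: the helper names (normalize, simple_ratio, simple_partial_ratio, simple_weighted_ratio) are made up for this example, and the token-based comparisons that WRatio also runs are left out.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    # Rough stand-in for fuzzywuzzy's default preprocessing (assumption):
    # replace non-word characters with spaces, lowercase, strip.
    return re.sub(r"\W", " ", text).lower().strip()


def simple_ratio(a, b):
    # Whole-string similarity on a 0-100 scale.
    return 100 * SequenceMatcher(None, a, b).ratio()


def simple_partial_ratio(a, b):
    # Best similarity between the shorter string and any
    # same-length window of the longer string.
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0.0
    width = len(shorter)
    windows = (longer[i:i + width] for i in range(len(longer) - width + 1))
    return max(simple_ratio(shorter, window) for window in windows)


def simple_weighted_ratio(query, choice):
    a, b = normalize(query), normalize(choice)
    if not a or not b:
        return 0
    base = simple_ratio(a, b)
    len_ratio = max(len(a), len(b)) / min(len(a), len(b))
    if len_ratio < 1.5:
        # Similar lengths: only the whole-string score counts.
        return round(base)
    # Very different lengths: a substring match is allowed,
    # but it is scaled down by 0.9, so it can never reach 100.
    return round(max(base, simple_partial_ratio(a, b) * 0.9))


query = "Apple"
choices = ["Apple", "Apple Inc.", "Apple Computer", "Apple Records", "Apple TV"]
for choice in choices:
    print(choice, simple_weighted_ratio(query, choice))
```

If you want every choice that contains the query to score 100, extractBests() accepts a scorer argument, so passing scorer=fuzz.partial_ratio (with from fuzzywuzzy import fuzz) should score on the best-matching substring alone, without the 0.9 penalty.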