python algorithm machine-learning pyspark fuzzy-logic

Develop a Python/PySpark program to display similar kinds of words

It should print the similar words in one column.

from fuzzywuzzy import fuzz
from fuzzywuzzy import process
query = "Apple"
#set of DATA 25 records
choices = ["apil",
    "apple",
    "Apille",
    "aple",
    "apil",
    "appple",
    "Apple APPLE",
    "Apil Orange",
    "apples"
]
process.extract(query, choices)
#### Printing Accuracy Value
print ("List of ratios: ")
print (process.extract(query, choices), "\n")
#process.extractone(query, choices)
print ("\nBest among the above list ----->",process.extractOne(query, choices))

Output:

List of ratios:

[('apple', 100), ('appple', 91), ('apples', 91), ('Apple APPLE', 90), ('aple', 89)]

Best among the above list -----> ('apple', 100)

Solution

I only had to change one line and add another to your snippet. The comments mark those changes and explain what they do. I wasn't sure about the exact output format you wanted, so feel free to ask again if this isn't it.

Take a look at list comprehensions if you want to dig deeper into how the last line works.
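As a rough illustration of what that list comprehension does, here is a minimal sketch that does not need fuzzywuzzy. The `ordered_choices` list is copied by hand from the output shown in the question, and the names `words` and `words_loop` are only placeholders for this example.

```python
# Sample result copied from the output shown in the question.
ordered_choices = [
    ("apple", 100),
    ("appple", 91),
    ("apples", 91),
    ("Apple APPLE", 90),
    ("aple", 89),
]

# List comprehension: unpack each (choice, score) tuple and keep only the choice.
words = [choice for choice, score in ordered_choices]

# The same thing written as an explicit loop.
words_loop = []
for choice, score in ordered_choices:
    words_loop.append(choice)

print(words)
```

Both versions build the same list; the comprehension is just the shorter way to write the loop.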

from fuzzywuzzy import fuzz
from fuzzywuzzy import process
query = "Apple"
#set of DATA 25 records
choices = ["apil",
    "apple",
    "Apille",
    "aple",
    "apil",
    "appple",
    "Apple APPLE",
    "Apil Orange",
    "apples"
]
# 1st change here
# The next line stores a tuple of each choice and its similarity score in a list.
# process.extract returns these entries sorted by score, highest first, and by default only the top 5.
ordered_choices = process.extract(query, choices)
#### Printing Accuracy Value
print ("List of ratios: ")
print (process.extract(query, choices), "\n")
#process.extractone(query, choices)
print ("\nBest among the above list ----->",process.extractOne(query, choices))

# 2nd change here
# The following line takes the first element of each tuple in the list and adds it to another list, which is then printed.
print("\nOrdered choices: ", [choice for choice, value in ordered_choices])
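If "one column" means one word per line, the matches could be joined with newlines instead of printed as a list. This is only a sketch under that assumption: `as_column` and its `min_score` parameter are hypothetical helpers, not part of fuzzywuzzy, and the sample data is copied from the output in the question.

```python
# Sample result copied from the output shown in the question.
ordered_choices = [
    ("apple", 100),
    ("appple", 91),
    ("apples", 91),
    ("Apple APPLE", 90),
    ("aple", 89),
]


def as_column(matches, min_score=0):
    """Return the matched words, one per line (hypothetical helper).

    Matches scoring below min_score are left out.
    """
    return "\n".join(choice for choice, score in matches if score >= min_score)


# Print every match in a single column.
print(as_column(ordered_choices))
```

Passing a higher `min_score`, for example `as_column(ordered_choices, min_score=91)`, would keep only the closer matches.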