I'm a newbie and struggling with what I'm sure is a simple task.
I have a list of words taken from POS tagging:
words = ['drink', 'drinking']
And I want to lemmatize them and then process them (using set?) to ultimately refine my list to:
refined_list = ['drink']
However, I'm stuck on the lemmatization step. My method still returns the following:
refined_list = ['drink', 'drinking']
I tried to reference this but can't figure out what to import so 'lmtzr' works or how to get it to work.
Here's my code so far:
import nltk
words = ['drink', 'drinking']
WNlemma = nltk.WordNetLemmatizer()
refined_list = [WNlemma.lemmatize(t) for t in words]
print(refined_list)
Thank you for helping me.
You need to set the pos parameter of lemmatize to VERB ('v'). By default it is NOUN, so the lemmatizer treats every word as a noun, even when the word is actually a verb. That is why 'drinking' comes back unchanged. With pos='v':
import nltk
words = ['drink', 'drinking']
WNlemma = nltk.WordNetLemmatizer()
refined_list = [WNlemma.lemmatize(t, pos='v') for t in words]
print(refined_list)
Output:
['drink', 'drink']
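To get from there to the single-entry list the question asks for, the duplicates still have to be removed. A plain set() would do it but does not guarantee order, so here is a minimal sketch that keeps first-seen order instead. The helper name dedupe_preserving_order is just an illustration, not part of NLTK, and the lemmas list is hard-coded to the output shown above so the snippet runs on its own:

```python
def dedupe_preserving_order(lemmas):
    """Return the unique items of `lemmas`, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this drops repeats without shuffling the list the way set() might.
    return list(dict.fromkeys(lemmas))


# Hard-coded here; in practice this is the result of lemmatize(t, pos='v').
lemmas = ['drink', 'drink']
refined_list = dedupe_preserving_order(lemmas)
print(refined_list)
```

If order does not matter, list(set(lemmas)) gives the same result for this input.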