python python-3.x nltk lemmatization

Iterate and Lemmatize List


I'm a newbie and struggling with what I'm sure is a simple task.

I have a list of words taken from POS tagging:

words = ['drink', 'drinking']

And I want to lemmatize them and then process them (using set?) to ultimately refine my list to:

refined_list = ['drink']

However, I'm stuck on the lemmatization step. My method still returns the following:

refined_list = ['drink', 'drinking']

I tried to reference this but can't figure out what to import so 'lmtzr' works or how to get it to work.

Here's my code so far:

import nltk
words = ['drink', 'drinking']
WNlemma = nltk.WordNetLemmatizer()
refined_list = [WNlemma.lemmatize(t) for t in words]
print(refined_list)

Thank you for helping me.


Solution

  • You need to pass pos='v' to lemmatize so the words are treated as verbs. The pos parameter defaults to noun, so without it every word is lemmatized as a noun, even when it is actually a verb. That is why 'drinking' came back unchanged.

    import nltk
    words = ['drink', 'drinking']
    WNlemma = nltk.WordNetLemmatizer()
    refined_list = [WNlemma.lemmatize(t, pos='v') for t in words]
    print(refined_list)
    

    Output:

    ['drink', 'drink']
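
The output above still contains a duplicate, while the question asks for ['drink']. A minimal sketch of the remaining step follows, along with a helper for choosing the pos value when the words are not all verbs. Both function names (penn_to_wordnet, unique_in_order) are hypothetical and not part of NLTK. The helper assumes the tags are Penn Treebank tags such as 'VBG' or 'NN', which is what nltk.pos_tag produces by default.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBG') to a WordNet pos letter.

    Hypothetical helper; assumes Penn Treebank style tags.
    """
    if tag.startswith('J'):
        return 'a'  # adjective
    if tag.startswith('V'):
        return 'v'  # verb
    if tag.startswith('R'):
        return 'r'  # adverb
    return 'n'      # noun, which is also the lemmatizer's default


def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item."""
    # dict keys preserve insertion order in Python 3.7+, unlike set()
    return list(dict.fromkeys(items))


lemmas = ['drink', 'drink']  # the lemmatized output shown above
refined_list = unique_in_order(lemmas)
print(refined_list)  # ['drink']
```

Using set(lemmas) would also remove the duplicate, but a set does not guarantee the original order. If you have (word, tag) pairs from the POS tagger, the helper can be plugged in as WNlemma.lemmatize(word, pos=penn_to_wordnet(tag)).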