SnowballStemmer for a list of Russian words


I know how to apply SnowballStemmer to a single word (in my case, a Russian one). I do the following:

from nltk.stem.snowball import SnowballStemmer 

stemmer = SnowballStemmer("russian") 
stemmer.stem("Василий")
'Васил'

How can I do the same if I have a list of words like ['Василий', 'Геннадий', 'Виталий']?

My approach using a list comprehension does not seem to work :(

l=[stemmer.stem(word) for word in l]

Solution

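  • Your variable l is never defined before the comprehension reads from it, which causes the NameError. Iterate over the list that actually holds your words (my_words here); see the last three lines of the session below for the fix.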

    >>> from nltk.stem.snowball import SnowballStemmer
    >>> stemmer = SnowballStemmer("russian") 
    >>> my_words = ['Василий', 'Геннадий', 'Виталий']
    >>> l=[stemmer.stem(word) for word in l]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'l' is not defined
    >>> l=[stemmer.stem(word) for word in my_words]
    >>> l
    ['васил', 'геннад', 'витал']
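
If you stem lists in more than one place, a small helper keeps the input and output names separate, so this kind of mix-up is harder to repeat. This is only a minimal sketch: stem_words is a hypothetical name, not part of NLTK, and it assumes nothing about the stemmer beyond it having a stem(word) method like the one used above.

```python
from typing import Iterable, List


def stem_words(words: Iterable[str], stemmer) -> List[str]:
    """Return the stem of every word, in order, without modifying the input.

    `stemmer` is assumed to be any object with a `stem(word)` method,
    for example nltk's SnowballStemmer("russian").
    """
    return [stemmer.stem(word) for word in words]
```

With the objects from the session above, l = stem_words(my_words, stemmer) should give the same list as the corrected comprehension.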