Tags: python, nlp, nltk, wordnet

Fastest way to check if word is in NLTK synsets?


I want to check whether certain words are present in NLTK's synsets. The following code does that,

    from nltk.corpus import wordnet

    # synsets() returns a list; an empty list means WordNet has no entry for the word
    if wordnet.synsets(word):
        pass  # do something

but it is rather slow if you have a lot of words to check. Is there a faster way? I don't need the actual synset object, just a yes/no if there's anything there. I don't have the list of words ahead of time, so I can't precompute the answers.


Solution

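  • Since all you need to know is whether a word is present, build a set of all lemma names once and look your words up there. Building the set is fast and only has to happen once; after that, each check is a constant-time (on average) set lookup that constructs no synset objects.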

    from nltk.corpus import wordnet

    # Build the lookup set once, up front
    wn_lemmas = set(wordnet.all_lemma_names())
    ...
    if word in wn_lemmas:
        pass  # do something
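
A slightly fuller sketch of the same set-lookup idea, in case it helps. The helper names (`normalize`, `build_lemma_index`, `is_known`, `load_wordnet_lemma_names`) are made up for illustration and are not part of NLTK. The normalization step assumes WordNet lemma names are lowercase with underscores joining multiword expressions. The demo uses a tiny stand-in list so it runs without NLTK installed.

```python
def normalize(word):
    """Lowercase and join multiword expressions with underscores
    (assumed to match the WordNet lemma-name format)."""
    return word.strip().lower().replace(" ", "_")


def build_lemma_index(lemma_names):
    """Build a frozenset once; later membership tests are O(1) on average."""
    return frozenset(normalize(name) for name in lemma_names)


def is_known(word, index):
    """Return True if the normalized word is in the index."""
    return normalize(word) in index


def load_wordnet_lemma_names():
    """Hypothetical loader: needs NLTK and the WordNet corpus.
    The import is lazy so the helpers above work without NLTK."""
    from nltk.corpus import wordnet
    return wordnet.all_lemma_names()


# Stand-in data for the demo; with NLTK available you would instead use
# build_lemma_index(load_wordnet_lemma_names()).
sample_index = build_lemma_index(["dog", "hot_dog", "run"])

print(is_known("Dog", sample_index))      # True (case is normalized)
print(is_known("hot dog", sample_index))  # True (space becomes underscore)
print(is_known("dogs", sample_index))     # False (inflected form is not a lemma)
```

One caveat worth checking against your data: `wordnet.synsets()` also tries morphological base forms, so an inflected word such as "dogs" can have synsets while being absent from the lemma set. If that matters, falling back to `wordnet.morphy(word)` on a miss is one possible way to cover it.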