I'd like to write a simple function that checks whether a word "exists" in WordNet via NLTK:
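    import nltk
    from nltk.corpus import wordnet as wn

    def is_known(word):
        """Return True if this word "exists" in WordNet
        (or at least in nltk.corpus.stopwords)."""
        # Stopwords count as known even though WordNet may not list them.
        if word.lower() in nltk.corpus.stopwords.words('english'):
            return True
        # wn.synsets returns an empty list when WordNet has no entry.
        synset = wn.synsets(word)
        if len(synset) == 0:
            return False
        else:
            return True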
Why would words like could, since, without, and although return False? Don't they appear in WordNet? Is there a better way to find out whether a word exists in WordNet (using NLTK)?
My first attempt was to filter out stopwords (words like to, if, when, then, I, you), but there are still very common words (like could) that I can't find.
WordNet does not contain these words or words like them. For an explanation, see the following from the WordNet docs:
Q. Why is WordNet missing: of, an, the, and, about, above, because, etc.
A. WordNet only contains "open-class words": nouns, verbs, adjectives, and adverbs. Thus, excluded words include determiners, prepositions, pronouns, conjunctions, and particles.
You also won't find these kinds of words in the online version of WordNet.
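Given that, one option is to treat closed-class words as known up front and only ask WordNet about everything else. The following is a minimal sketch, not a definitive implementation. `make_is_known` and `EXTRA_CLOSED_CLASS` are hypothetical names introduced here, and the extra word set simply contains the four examples from the question. The lookup is passed in as a parameter so the logic can be tried without the WordNet data installed. With NLTK you would pass `wn.synsets` and `stopwords.words('english')`.

```python
from typing import Callable, Iterable, Sequence

# Hypothetical supplement: closed-class words from the question that the
# stopword list may not cover. Extend as needed.
EXTRA_CLOSED_CLASS = frozenset({"could", "since", "without", "although"})


def make_is_known(
    synset_lookup: Callable[[str], Sequence],
    stopword_list: Iterable[str],
) -> Callable[[str], bool]:
    """Build an is_known(word) predicate.

    synset_lookup: callable returning a (possibly empty) sequence of
        synsets, e.g. nltk.corpus.wordnet.synsets.
    stopword_list: iterable of stopwords, e.g.
        nltk.corpus.stopwords.words('english').
    """
    # Build the set once so each membership test is a hash lookup
    # instead of a scan over the whole list on every call.
    closed_class = frozenset(w.lower() for w in stopword_list) | EXTRA_CLOSED_CLASS

    def is_known(word: str) -> bool:
        if word.lower() in closed_class:
            return True
        # An empty result means the lookup has no entry for the word.
        return len(synset_lookup(word)) > 0

    return is_known


# Assumed usage with NLTK and its corpora downloaded:
#   from nltk.corpus import stopwords, wordnet as wn
#   is_known = make_is_known(wn.synsets, stopwords.words('english'))
```

Building the closed-class set once also avoids reloading the stopword list on every call, which the original function does.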