
jupyterhub - NLTK - unable to use stopwords - Resource stopwords not found


I am using the code below to load the NLTK stopwords from a Jupyter notebook. Jupyter is hosted on a Linux server, and I am working in a notebook on that server.

python3 -m nltk.downloader stopwords
python3 -m nltk.downloader words
python3 -m nltk.downloader punkt

python3
>>> from nltk.corpus import stopwords
>>> stop_words = set(stopwords.words("english"))
>>> print(stop_words)

This works fine in the Python terminal, but when I run the same code in the Jupyter notebook, it fails with the following error.

from nltk.corpus import stopwords
stop_words = set(stopwords.words("english"))
print(stop_words)

---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/nltk/corpus/util.py in __load(self)
     82                 try:
---> 83                     root = nltk.data.find("{}/{}".format(self.subdir, zip_name))
     84                 except LookupError:

/usr/local/lib/python3.7/site-packages/nltk/data.py in find(resource_name, paths)
    582     resource_not_found = "\n%s\n%s\n%s\n" % (sep, msg, sep)
--> 583     raise LookupError(resource_not_found)
    584 

LookupError: 
**********************************************************************
  Resource stopwords not found.
  Please use the NLTK Downloader to obtain the resource:

Solution

  • Try running the download from inside the Jupyter notebook itself, so the resource is fetched by the same Python process that needs it:

    import nltk 
    nltk.download('stopwords')
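
If the download alone does not help, one likely cause (an assumption, since the question does not confirm it) is that the notebook kernel runs as a different user or with a different Python interpreter than the terminal, so NLTK searches different directories. Below is a minimal sketch that can be run in both the terminal and a notebook cell to compare the two environments. `describe_environment` is a hypothetical helper name, not part of NLTK; the only NLTK feature it relies on is the `nltk.data.path` list of search directories.

```python
import importlib.util
import os
import sys


def describe_environment():
    """Collect settings that can differ between a terminal and a notebook kernel."""
    info = {
        "python_executable": sys.executable,        # which interpreter is running
        "home_directory": os.path.expanduser("~"),  # depends on the current user
        "NLTK_DATA": os.environ.get("NLTK_DATA"),   # optional override, may be None
        "nltk_installed": importlib.util.find_spec("nltk") is not None,
        "nltk_search_paths": [],
        "stopwords_found_in": [],
    }
    if info["nltk_installed"]:
        import nltk.data

        info["nltk_search_paths"] = list(nltk.data.path)
        # The corpus may be present as an unpacked folder or as a zip file.
        for base in info["nltk_search_paths"]:
            folder = os.path.join(base, "corpora", "stopwords")
            if os.path.exists(folder) or os.path.exists(folder + ".zip"):
                info["stopwords_found_in"].append(base)
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

If `python_executable`, `home_directory`, or `nltk_search_paths` differ between the two runs, the corpus downloaded from the terminal is probably stored somewhere the notebook kernel does not look. In that case, running `nltk.download('stopwords')` inside the notebook, as above, places it in a directory the kernel can see.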