nltk.download('wordnet') is giving "ParseError: mismatched tag: line 33, column 2" on Python 3.10


When I try to use nltk.stem.WordNetLemmatizer(), I get the error below.

LookupError: 
**********************************************************************
  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('wordnet')
  
  For more information see: https://www.nltk.org/data.html

When I run the suggested download:

import nltk
nltk.download('wordnet')

I get this ParseError:

Traceback (most recent call last):

  File ~\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py:3460 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In[32], line 2
    nltk.download('wordnet')

  File ~\Anaconda3\lib\site-packages\nltk\downloader.py:777 in download
    for msg in self.incr_download(info_or_id, download_dir, force):

  File ~\Anaconda3\lib\site-packages\nltk\downloader.py:629 in incr_download
    info = self._info_or_id(info_or_id)

  File ~\Anaconda3\lib\site-packages\nltk\downloader.py:603 in _info_or_id
    return self.info(info_or_id)

  File ~\Anaconda3\lib\site-packages\nltk\downloader.py:1009 in info
    self._update_index()

  File ~\Anaconda3\lib\site-packages\nltk\downloader.py:952 in _update_index
    ElementTree.parse(urlopen(self._url)).getroot()

  File ~\Anaconda3\lib\xml\etree\ElementTree.py:1222 in parse
    tree.parse(source, parser)

  File ~\Anaconda3\lib\xml\etree\ElementTree.py:580 in parse
    self._root = parser._parse_whole(source)

  File <string>
ParseError: mismatched tag: line 33, column 2

I originally ran the code in Jupyter Notebook, then restarted the kernel and tried again. I also tried running it in the plain Python interpreter. I get the same error every time.
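
The traceback shows the failure happens while the downloader parses its package index (ElementTree.parse(urlopen(self._url))), before wordnet itself is fetched. One way to see what is actually coming back is to fetch that index by hand and try to parse it. This is only a diagnostic sketch: INDEX_URL is my assumption about the downloader's default index location, and check_index and fetch_index are hypothetical helper names, not part of nltk.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Assumption: the default index URL used by nltk's downloader.
# Replace it with whatever your installation actually points at.
INDEX_URL = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml"


def check_index(data: bytes) -> str:
    """Return 'ok' if data parses as XML, otherwise the parser's error message."""
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return "ok"


def fetch_index(url: str = INDEX_URL) -> bytes:
    """Download the raw bytes served at the index URL."""
    with urlopen(url, timeout=30) as response:
        return response.read()


# Usage (needs network access):
#   body = fetch_index()
#   print(check_index(body))   # "ok" or the ParseError text
#   print(body[:300])          # peek at what was served
```

If the first few hundred bytes look like an HTML page rather than XML, the response is probably coming from something in between (for example a proxy or an error page) rather than from the index itself. That is a guess on my part, not something the traceback proves.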

My nltk version is 3.7.


Solution

  • There may be other solutions to this issue, but what solved it for me was manually downloading wordnet from https://www.nltk.org/nltk_data/ and saving it where the documentation tells you to (C:\nltk_data\corpora\wordnet).
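
The manual step above can also be scripted. The following is a minimal sketch, assuming you have already downloaded wordnet.zip by hand and that the archive contains a top-level wordnet/ folder. install_corpus_zip is a hypothetical helper name, not an nltk function, and the paths in the usage comment are examples only.

```python
import zipfile
from pathlib import Path


def install_corpus_zip(zip_path, data_dir):
    """Extract a manually downloaded corpus zip into <data_dir>/corpora.

    Returns the corpora directory so the caller can check the result.
    """
    corpora_dir = Path(data_dir) / "corpora"
    corpora_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the zip holds a top-level "wordnet/" folder,
        # so this produces <data_dir>/corpora/wordnet/...
        archive.extractall(corpora_dir)
    return corpora_dir


# Usage (example paths, adjust to your machine):
#   install_corpus_zip(r"C:\Users\me\Downloads\wordnet.zip", r"C:\nltk_data")
#
# If you extract somewhere other than a default location, tell nltk where to look:
#   import nltk
#   nltk.data.path.append(r"D:\my_nltk_data")
```

After this, nltk.stem.WordNetLemmatizer() should find the corpus without calling nltk.download() at all, which sidesteps the index parsing that raised the ParseError.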