python nlp nltk corpus

NLTK cannot import a specific corpus (pl196x)


My NLTK installation refuses to import the pl196x corpus, which is included in the standard package of corpora. When I run from nltk.corpus import brown everything goes smoothly, but from nltk.corpus import pl196x always fails with:

    Traceback (most recent call last):
      File "<input>", line 1, in <module>
    ImportError: cannot import name 'pl196x' from 'nltk.corpus' (C:\my\path\to\__init__.py)

This has already happened on multiple PCs and operating systems.

  1. I ran nltk.download() for all corpora, and they show as downloaded.
  2. The files are there, in the nltk data folder.
  3. I checked nltk.data.path, and it does contain 'C:\\nltk_data'.

I have no idea what is wrong. The only explanation I can think of is that the corpus was somehow discontinued. Any pointers would be highly appreciated.


Solution

  • The pl196x corpus is not exposed by nltk.corpus directly. The appropriate way to import it is from the reader subpackage:

    from nltk.corpus.reader import pl196x
    

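    This is hinted at by the documentation of the module. Note that this imports the reader module, which defines the Pl196xCorpusReader class, rather than a ready-made corpus object such as brown.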
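
    To check on a given installation where a name such as pl196x can actually be imported from, a small diagnostic along the following lines may help. It is only a sketch: locate_name is a hypothetical helper written for this illustration, not part of NLTK, and it relies only on the standard importlib module.

```python
import importlib


def locate_name(name, packages=("nltk.corpus", "nltk.corpus.reader")):
    """Return the packages in `packages` that expose `name`.

    `locate_name` is a hypothetical helper for illustration, not an NLTK API.
    """
    found = []
    for pkg in packages:
        try:
            module = importlib.import_module(pkg)
        except ImportError:
            continue  # the package itself is not installed
        if hasattr(module, name):
            # Already available as an attribute (an object or a loaded submodule).
            found.append(pkg)
            continue
        try:
            # Otherwise, check whether it exists as an importable submodule.
            importlib.import_module(f"{pkg}.{name}")
        except ImportError:
            continue
        found.append(pkg)
    return found


if __name__ == "__main__":
    # Lists the packages from which "pl196x" can be imported, if any.
    print(locate_name("pl196x"))
```

    On an installation that behaves as described in the question, the printed list would be expected to include nltk.corpus.reader but not nltk.corpus, which matches the ImportError shown above.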