python nltk stanford-nlp anaconda spyder

NLTK API to Stanford POSTagger works in IPython in the terminal but not in Spyder (Anaconda)


I downloaded the Stanford POS tagger and parser following the instructions given for the question below:

Stanford Parser and NLTK

When I run the commands at the bottom, they work perfectly in IPython in the terminal (Mac OS) but fail in Spyder (Anaconda) with the error "NLTK was unable to find stanford-postagger.jar!". Since I have set CLASSPATH in the terminal, I am not sure what went wrong. When I checked

import os
print(os.environ.get('CLASSPATH'))

It returned None in Spyder but the correct path in the terminal. I have also restarted the program and set the working directory to $HOME. Is there anything I might be missing here?

from nltk.tag.stanford import StanfordPOSTagger
st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
st.tag('What is the airspeed of an unladen swallow ?'.split())

Solution

  • The problem has nothing to do with Python or NLTK; it's a consequence of how OS X starts GUI applications. The CLASSPATH environment variable is set in your .profile or its kin, but those files are only read by shells, such as the one Terminal starts. GUI applications inherit their environment from your login process, which never reads them and so doesn't know about CLASSPATH.

    There are numerous SO questions about how to set environment variables for GUI applications on OS X. But in your case, there are also a couple of workarounds that ought to work:

    1. Start Spyder from the Terminal command line, not via the Launchpad (just type spyder &). Or

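    2. Your Python program can also set its own environment (which will be inherited by child processes) prior to launching the Stanford parser. Assign to os.environ rather than calling os.putenv: putenv does not update os.environ, which is where NLTK looks up CLASSPATH. Like this:

      os.environ["CLASSPATH"] = "/path/to/the/parser"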
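If you'd rather not clobber a CLASSPATH that is already set (for example when the same script is also run from the terminal), workaround 2 can be wrapped in a small helper. This is only a sketch: the name add_to_classpath is made up for illustration, and "/path/to/the/parser" is the same placeholder as above, to be replaced by the directory that actually holds your Stanford jars. It assigns through os.environ, so the change is visible both to Python code in the same process and to any child process started afterwards.

```python
import os
import subprocess
import sys


def add_to_classpath(directory, environ=os.environ):
    """Prepend `directory` to CLASSPATH in `environ` unless it is already listed.

    Hypothetical helper for illustration; `environ` defaults to the real
    process environment but can be any dict-like mapping.
    """
    current = environ.get("CLASSPATH", "")
    # Split the existing value into entries, dropping empty strings.
    entries = [p for p in current.split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    environ["CLASSPATH"] = os.pathsep.join(entries)
    return environ["CLASSPATH"]


if __name__ == "__main__":
    # Placeholder path from the answer above; substitute your own directory.
    add_to_classpath("/path/to/the/parser")

    # A child process started from here inherits the updated environment.
    child_view = subprocess.check_output(
        [sys.executable, "-c", "import os; print(os.environ.get('CLASSPATH'))"]
    )
    print(child_view.decode().strip())
```

Call it before constructing StanfordPOSTagger, so the variable is already in place when NLTK searches for the jar.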