Tags: python-3.x, nltk, stanford-nlp

Stanford segmenter with NLTK: Could not find SLF4J in your classpath


I've set up an NLTK and Stanford environment, and NLTK and the Stanford jars have been downloaded. Programs that use only NLTK work fine, but I have trouble with the Stanford segmenter. A simple program that uses the Stanford segmenter fails with the error "Could not find SLF4J in your classpath", even though I have exported all the jars, including slf4j-api.jar. Details are as follows:

  • Python 3.5, NLTK 3.2.2, Stanford jars 3.7
  • OS: CentOS
  • Environment variables:

    export JAVA_HOME=/usr/java/jdk1.8.0_60
    export NLTK_DATA=/opt/nltk_data
    export STANFORD_SEGMENTER_PATH=/opt/stanford/stanford-segmenter-3.7
    export CLASSPATH=$CLASSPATH:$STANFORD_SEGMENTER_PATH/stanford-segmenter.jar
    export STANFORD_POSTAGGER_PATH=/opt/stanford/stanford-postagger-full-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_POSTAGGER_PATH/stanford-postagger.jar
    export STANFORD_NER_PATH=/opt/stanford/stanford-ner-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_NER_PATH/stanford-ner.jar
    export STANFORD_MODELS=$STANFORD_NER_PATH/classifiers:$STANFORD_POSTAGGER_PATH/models
    export STANFORD_PARSER_PATH=/opt/stanford/stanford-parser-full-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_PARSER_PATH/stanford-parser.jar:$STANFORD_PARSER_PATH/stanford-parser-3.6.0-models.jar:$STANFORD_PARSER_PATH/slf4j-api.jar:$STANFORD_PARSER_PATH/ejml-0.23.jar
    export STANFORD_CORENLP_PATH=/opt/stanford/stanford-corenlp-full-2016-10-31
    export CLASSPATH=$CLASSPATH:$STANFORD_CORENLP_PATH/stanford-corenlp-3.7.0.jar:$STANFORD_CORENLP_PATH/stanford-corenlp-3.7.0-models.jar:$STANFORD_CORENLP_PATH/javax.json.jar:$STANFORD_CORENLP_PATH/joda-time.jar:$STANFORD_CORENLP_PATH/jollyday.jar:$STANFORD_CORENLP_PATH/protobuf.jar:$STANFORD_CORENLP_PATH/slf4j-simple.jar:$STANFORD_CORENLP_PATH/xom.jar
    export STANFORD_CORENLP=$STANFORD_CORENLP_PATH
    

The program is as follows:

>>> from nltk.tokenize import StanfordSegmenter
>>> segmenter = StanfordSegmenter(
...     path_to_sihan_corpora_dict="/opt/stanford/stanford-segmenter-3.7/data/",
...     path_to_model="/opt/stanford/stanford-segmenter-3.7/data/pku.gz",
...     path_to_dict="/opt/stanford/stanford-segmenter-3.7/data/dict-chris6.ser.gz"
... )
>>> res = segmenter.segment(u"北海已成为中国对外开放中升起的一颗明星")

The error is as follows:

Exception in thread "main" java.lang.ExceptionInInitializerError
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.<clinit>(AbstractSequenceClassifier.java:88)
Caused by: java.lang.IllegalStateException: Could not find SLF4J in your classpath
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers.lambda$static$530(RedwoodConfiguration.java:190)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers$7.buildChain(RedwoodConfiguration.java:309)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers$7.apply(RedwoodConfiguration.java:318)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration.lambda$handlers$535(RedwoodConfiguration.java:363)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration.apply(RedwoodConfiguration.java:41)
    at edu.stanford.nlp.util.logging.Redwood.<clinit>(Redwood.java:609)
    ... 1 more
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: java.lang.ClassNotFoundException: edu.stanford.nlp.util.logging.SLF4JHandler
    at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:364)
    at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:381)
    at edu.stanford.nlp.util.logging.RedwoodConfiguration$Handlers.lambda$static$530(RedwoodConfiguration.java:186)
    ... 6 more
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.util.logging.SLF4JHandler
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.construct(MetaClass.java:135)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:202)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:69)
    at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:360)
    ... 8 more

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 96, in segment
    return self.segment_sents([tokens])
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 123, in segment_sents
    stdout = self._execute(cmd)
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/tokenize/stanford_segmenter.py", line 143, in _execute
    cmd,classpath=self._stanford_jar, stdout=PIPE, stderr=PIPE)
  File "/usr/local/python3/lib/python3.5/site-packages/nltk/internals.py", line 134, in java
    raise OSError('Java command failed : ' + str(cmd))
OSError: Java command failed : ['/usr/java/jdk1.8.0_60/bin/java', '-mx2g', '-cp', '/opt/stanford/stanford-segmenter-3.7/stanford-segmenter.jar:/opt/stanford/stanford-parser-full-2016-10-31/slf4j-api.jar', 'edu.stanford.nlp.ie.crf.CRFClassifier', '-sighanCorporaDict', '/opt/stanford/stanford-segmenter-3.7/data/', '-textFile', '/tmp/tmpkttpldl6', '-sighanPostProcessing', 'true', '-keepAllWhitespaces', 'false', '-loadClassifier', '/opt/stanford/stanford-segmenter-3.7/data/pku.gz', '-serDictionary', '/opt/stanford/stanford-segmenter-3.7/data/dict-chris6.ser.gz', '-inputEncoding', 'UTF-8']

Thank you in advance!


Solution

  • With the current code base, if slf4j-api.jar is on your CLASSPATH and you run the 3.7.0 segmenter, you will get this error. I'm going to push a code change to fix this, but for the time being, removing slf4j-api.jar from the CLASSPATH should make the error go away.
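
If you would rather not edit your shell profile, the same workaround can be sketched from Python by filtering the CLASSPATH of the running process before the segmenter is created. This is only a minimal sketch: `drop_classpath_entries` is a hypothetical helper name, not part of NLTK, and it assumes that NLTK looks up the jars through the CLASSPATH of the current process. Whether NLTK 3.2.2 accepts a CLASSPATH without slf4j-api.jar has not been verified here.

```python
import os


def drop_classpath_entries(classpath, unwanted_name="slf4j-api.jar", sep=os.pathsep):
    """Return `classpath` without the entries whose file name is `unwanted_name`.

    Hypothetical helper for illustration; it only manipulates the string.
    """
    # Split on the platform separator and ignore empty entries (e.g. a leading ':').
    entries = [entry for entry in classpath.split(sep) if entry]
    # Compare file names only, so slf4j-simple.jar and other jars are kept.
    kept = [entry for entry in entries if os.path.basename(entry) != unwanted_name]
    return sep.join(kept)


# Apply the filter to this process only; the shell environment is left untouched.
os.environ["CLASSPATH"] = drop_classpath_entries(os.environ.get("CLASSPATH", ""))
```

Run this before constructing `StanfordSegmenter`. Only entries named exactly slf4j-api.jar are dropped, so the other jars exported in the question stay on the CLASSPATH.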