I'm trying to train with ClearParser and I get this error. Before execute the command I put export CLASSPATH=nlp4j-1.1.0.jar:.
and doing java edu.emory.mathcs.nlp.bin.Version
I get the version info, so it's installed correctly.
Command line: java -Xmx5g -XX:+UseConcMarkSweepGC edu.emory.mathcs.nlp.bin.NLPTrain -mode dep -c config-train-dep.xml -t /home/iago/Escritorio/idiomasClearParser/UD_English/en-ud-train.conllu -d /home/iago/Escritorio/idiomasClearParser/UD_English/en-ud-dev.conllu -m bestModel-dep.xz
I'm using this config file: https://github.com/emorynlp/nlp4j/blob/master/src/main/resources/edu/emory/mathcs/nlp/configuration/config-train-dep.xml
Error: log4j:WARN No appenders could be found for logger (edu.emory.mathcs.nlp.common.util.BinUtils). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. java.io.FileNotFoundException: edu/emory/mathcs/nlp/lexica/en-brown-clusters-simplified-lowercase.xz (No existe el archivo o el directorio) at java.io.FileInputStream.open0(Native Method) at java.io.FileInputStream.open(FileInputStream.java:195) at java.io.FileInputStream.<init>(FileInputStream.java:138) at java.io.FileInputStream.<init>(FileInputStream.java:93) at edu.emory.mathcs.nlp.common.util.IOUtils.createFileInputStream(IOUtils.java:147) at edu.emory.mathcs.nlp.common.util.IOUtils.getInputStream(IOUtils.java:316) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.getLexiconFieldPair(GlobalLexica.java:82) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.getLexiconFieldPair(GlobalLexica.java:72) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.<init>(GlobalLexica.java:64) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.<init>(GlobalLexica.java:55) at edu.emory.mathcs.nlp.bin.NLPTrain$1.createGlobalLexica(NLPTrain.java:108) at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:193) at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:187) at edu.emory.mathcs.nlp.bin.NLPTrain.train(NLPTrain.java:76) at edu.emory.mathcs.nlp.bin.NLPTrain.main(NLPTrain.java:115) java.io.IOException: Stream closed at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:159) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) at java.io.DataInputStream.readFully(DataInputStream.java:195) at java.io.DataInputStream.readFully(DataInputStream.java:169) at org.tukaani.xz.SingleXZInputStream.initialize(Unknown Source) at org.tukaani.xz.SingleXZInputStream.<init>(Unknown Source) at org.tukaani.xz.XZInputStream.<init>(Unknown Source) at org.tukaani.xz.XZInputStream.<init>(Unknown Source) at edu.emory.mathcs.nlp.common.util.IOUtils.createXZBufferedInputStream(IOUtils.java:220) at edu.emory.mathcs.nlp.common.util.IOUtils.createObjectXZBufferedInputStream(IOUtils.java:259) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.getLexiconFieldPair(GlobalLexica.java:82) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.getLexiconFieldPair(GlobalLexica.java:72) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.<init>(GlobalLexica.java:64) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.<init>(GlobalLexica.java:55) at edu.emory.mathcs.nlp.bin.NLPTrain$1.createGlobalLexica(NLPTrain.java:108) at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:193) at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:187) at edu.emory.mathcs.nlp.bin.NLPTrain.train(NLPTrain.java:76) at edu.emory.mathcs.nlp.bin.NLPTrain.main(NLPTrain.java:115) Exception in thread "main" java.lang.NullPointerException at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2338) at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2351) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2822) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:804) at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301) at edu.emory.mathcs.nlp.common.util.IOUtils.createObjectXZBufferedInputStream(IOUtils.java:259) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.getLexiconFieldPair(GlobalLexica.java:82) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.getLexiconFieldPair(GlobalLexica.java:72) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.<init>(GlobalLexica.java:64) at edu.emory.mathcs.nlp.component.template.util.GlobalLexica.<init>(GlobalLexica.java:55) at edu.emory.mathcs.nlp.bin.NLPTrain$1.createGlobalLexica(NLPTrain.java:108) at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:193) at edu.emory.mathcs.nlp.component.template.train.OnlineTrainer.train(OnlineTrainer.java:187) at edu.emory.mathcs.nlp.bin.NLPTrain.train(NLPTrain.java:76) at edu.emory.mathcs.nlp.bin.NLPTrain.main(NLPTrain.java:115)
Why I'm getting this error? I unpacked the .jar and there is no "lexica" folder neither "en-brown-clusters-simplified-lowercase.xz". Where I can found it?
Regards
I found the solution, this error happens because you don't have the "log4j.properties" configured so "nlp4j" can't find it. To fix it only create the file on the same folder that the .jar with this simple code (if you want more details adapt it to your needs)
# Root logger option
log4j.rootLogger=INFO, file, stdout
# Direct log messages to a log file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File= /path/to/file
log4j.appender.file.MaxFileSize=5MB #Set what you need
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
Moreover, to solve the problem of lexica. Go to this url and download the jars lexica url
Then, on the config xml set the correct path to the jars.
And now it should works. Hope it helps someone.