Stanford CoreNLP version change in pom.xml causing error


I am using Stanford CoreNLP on Ubuntu 14.04 and get the error shown below when I run this code:

Java Code:

package com.mycompany.app;

import java.io.*; 
import java.util.*;

/*import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.TaggedWord;
import edu.stanford.nlp.parser.shiftreduce.ShiftReduceParser;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.parser.lexparser.ExhaustivePCFGParser;
import edu.stanford.nlp.trees.Tree;*/

import edu.stanford.nlp.tagger.maxent.MaxentTagger;

public class App
{   
    public static void main(String[] args) throws Exception
    {
        MaxentTagger tagger = null;
        if(tagger == null)
        {
            tagger = new MaxentTagger("mymodel.tagger");
        }
        System.out.println("Let's do this!");
    }
}

Command Run:

mvn clean install exec:java -Dexec.mainClass=com.mycompany.app.App

Terminal Output:

[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ dt_mvn ---
[INFO] Building jar: /home/sidharth/Desktop/dt_mvn/target/dt_mvn-1.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-install-plugin:2.3:install (default-install) @ dt_mvn ---
[INFO] Installing /home/sidharth/Desktop/dt_mvn/target/dt_mvn-1.0-SNAPSHOT.jar to /home/sidharth/.m2/repository/com/mycompany/app/dt_mvn/1.0-SNAPSHOT/dt_mvn-1.0-SNAPSHOT.jar
[INFO] Installing /home/sidharth/Desktop/dt_mvn/pom.xml to /home/sidharth/.m2/repository/com/mycompany/app/dt_mvn/1.0-SNAPSHOT/dt_mvn-1.0-SNAPSHOT.pom
[INFO] 

[INFO] --- exec-maven-plugin:1.4.0:java (default-cli) @ dt_mvn ---
Reading POS tagger model from mymodel.tagger ... [WARNING] 
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: edu.stanford.nlp.io.RuntimeIOException: java.io.StreamCorruptedException: invalid stream header: 00048E4D
    at edu.stanford.nlp.maxent.iis.LambdaSolve.read_lambdas(LambdaSolve.java:726)
    at edu.stanford.nlp.tagger.maxent.LambdaSolveTagger.<init>(LambdaSolveTagger.java:76)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:863)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:767)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:298)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:263)
    at com.mycompany.app.App.main(App.java:22)
    ... 6 more
Caused by: java.io.StreamCorruptedException: invalid stream header: 00048E4D
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:806)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
    at edu.stanford.nlp.maxent.iis.LambdaSolve.read_lambdas(LambdaSolve.java:719)
    ... 12 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.260s
[INFO] Finished at: Sun Dec 20 01:34:29 IST 2015
[INFO] Final Memory: 23M/228M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (default-cli) on project dt_mvn: An exception occured while executing the Java class. null: InvocationTargetException: java.io.StreamCorruptedException: invalid stream header: 00048E4D -> [Help 1]

pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.app</groupId>
  <artifactId>dt_mvn</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>dt_mvn</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>edu.stanford.nlp</groupId>
      <artifactId>stanford-corenlp</artifactId>
      <version>3.5.2</version>
    </dependency>
  </dependencies>
</project>

However, if I change the stanford-corenlp version in pom.xml from 3.5.2 to 1.3.0, the code runs correctly. What could be the reason for this?

Thanks!

P.S. If it's of any use, the maven project was created by the following command:

mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=dt_mvn -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

Solution

  • I suspect that "mymodel.tagger" was built in a way that is incompatible with later versions, so the deserialization is failing. What version did you use to build mymodel.tagger? Does your Maven project also fail if you use 1.3.5? I can see changes involving serialization in LambdaSolve.java in our repo between 1.3.1 and 1.3.5, so I suspect that is the reason.

    Also make sure you use Java 1.8 with Stanford CoreNLP 3.5.0 or later.
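
    To illustrate what the "invalid stream header" message means, here is a minimal sketch that uses only the JDK and does not touch CoreNLP. It assumes nothing about how the tagger lays out its model file; the class name StreamHeaderDemo and its helper methods are hypothetical and exist only for this demonstration. ObjectInputStream expects the bytes at its read position to begin with the Java serialization header. When it finds something else, such as the four bytes 00 04 8E 4D reported in the trace above, its constructor throws StreamCorruptedException. That is what you would expect if a newer reader looks for serialized objects at a position where an older writer stored data in a different layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;

// Hypothetical demo class; not part of Stanford CoreNLP.
public class StreamHeaderDemo {

    // Returns the exception message if the bytes do not begin with a valid
    // Java serialization header, or null if the header is accepted.
    public static String headerError(byte[] bytes) throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return null; // the constructor has already validated the header
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        }
    }

    // Serializes an object with ObjectOutputStream, which writes a valid header.
    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The four bytes reported in the stack trace of the question.
        byte[] fromTrace = {0x00, 0x04, (byte) 0x8E, 0x4D};
        System.out.println("bytes from trace: " + headerError(fromTrace));

        // A stream produced by ObjectOutputStream is accepted.
        System.out.println("real object stream: " + headerError(serialize("hello")));
    }
}
```

    If this is what is happening with your model, the file itself is probably not damaged. Retraining the model with the same CoreNLP version that you load it with should then avoid the mismatch.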