java, c++, java-native-interface, stanford-nlp

Java Native Interface - C++ is not waiting for java function completion


I want the functionality of the Stanford Core NLP, which is written in Java, to be available in C++. To do this I am using the Java Native Interface. I have a Java object that wraps several functions so that they are easier to call from C++. However, when I call those functions, the C++ doesn't wait for them to complete before moving on to the next one.

The Java object has a main function, which I use for testing, that calls all the relevant functions. When I run just the Java, it works perfectly: the annotation waits for the setup to complete (which does take a while), and the function that gets the dependencies waits for the annotation function to complete. This is the expected and correct behavior. The problem comes when I call the Java functions from C++. Part of each Java function runs, but it quits and returns to the C++ at certain points, marked in the code below. I would like the C++ to wait for the Java methods to finish.

If it matters, I'm using Stanford Core NLP 3.9.2.

I used the code in StanfordCoreNlpDemo.java that comes with the NLP .jar files as a starting point.

import java.io.*;
import java.util.*;

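// Stanford Core NLP imports
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;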

public class StanfordCoreNLPInterface {

    Annotation annotation;
    StanfordCoreNLP pipeline;

    public StanfordCoreNLPInterface() {}

    /** setup the NLP pipeline */
    public void setup() {
        // Add in sentiment
        System.out.println("creating properties");
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref, sentiment");
        System.out.println("starting the parser pipeline");
        //<---- doesn't get past this point
        pipeline = new StanfordCoreNLP(props);
        System.out.println("started the parser pipeline");
    }

    /** annotate the text */
    public void annotateText(String text) {
        // Initialize an Annotation with some text to be annotated. The text is the argument to the constructor.
        System.out.println("text");
        System.out.println(text);
        //<---- doesn't get past this point
        annotation = new Annotation(text);
        System.out.println("annotation set");
        // run all the selected annotators on this text
        pipeline.annotate(annotation);
        System.out.println("annotated");
    }

    /** print the dependencies */
    public void dependencies() {
        // An Annotation is a Map with Class keys for the linguistic analysis types.
        // You can get and use the various analyses individually.
        // For instance, this gets the parse tree of the first sentence in the text.
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        if (sentences != null && ! sentences.isEmpty()) {
            CoreMap sentence = sentences.get(0);
            System.out.println("The first sentence dependencies are:");
            SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.EnhancedPlusPlusDependenciesAnnotation.class);
            System.out.println(graph.toString(SemanticGraph.OutputFormat.LIST));
        }
    }

    /** Compile: javac -classpath stanford-corenlp-3.9.2.jar -Xlint:deprecation StanfordCoreNLPInterface.java*/
    /** Usage: java -cp .:"*" StanfordCoreNLPInterface*/
    public static void main(String[] args) throws IOException {
        System.out.println("starting main function");
        StanfordCoreNLPInterface NLPInterface = new StanfordCoreNLPInterface();
        System.out.println("new object");
        NLPInterface.setup();
        System.out.println("setup done");

        NLPInterface.annotateText("Here is some text to annotate");
        NLPInterface.dependencies();
    }
}

I used the code in this tutorial http://tlab.hatenablog.com/entry/2013/01/12/125702 as a starting point.

#include <jni.h>

#include <cassert>
#include <iostream>


/** Build:  g++ -Wall main.cpp -I/usr/lib/jvm/java-8-openjdk/include -I/usr/lib/jvm/java-8-openjdk/include/linux -L${LIBPATH} -ljvm*/
int main(int argc, char** argv) {
    // Establish the JVM variables
    const int kNumOptions = 4;  // must match the number of initializers below
    JavaVMOption options[kNumOptions] = {
        { const_cast<char*>("-Xmx128m"), NULL },
        { const_cast<char*>("-verbose:gc"), NULL },
        { const_cast<char*>("-Djava.class.path=stanford-corenlp"), NULL },
        { const_cast<char*>("-cp stanford-corenlp/.:stanford-corenlp/*"), NULL }
    };

    // JVM setup before this point.
    // java object is created using env->AllocObject();
    // get the class methods
    jmethodID mid =
        env->GetStaticMethodID(cls, kMethodName, "([Ljava/lang/String;)V");
    jmethodID midSetup =
        env->GetMethodID(cls, kMethodNameSetup, "()V");
    jmethodID midAnnotate =
        env->GetMethodID(cls, kMethodNameAnnotate, "(Ljava/lang/String;)V");
    jmethodID midDependencies =
        env->GetMethodID(cls, kMethodNameDependencies, "()V");
    if (mid == NULL) {
        std::cerr << "FAILED: GetStaticMethodID" << std::endl;
        return -1;
    }
    if (midSetup == NULL) {
        std::cerr << "FAILED: GetMethodID Setup" << std::endl;
        return -1;
    }
    if (midAnnotate == NULL) {
        std::cerr << "FAILED: GetMethodID Annotate" << std::endl;
        return -1;
    }
    if (midDependencies == NULL) {
        std::cerr << "FAILED: GetMethodID Dependencies" << std::endl;
        return -1;
    }
    std::cout << "Got all the methods" << std::endl;

    const jsize kNumArgs = 1;
    jclass string_cls = env->FindClass("java/lang/String");
    jobject initial_element = NULL;
    jobjectArray method_args = env->NewObjectArray(kNumArgs, string_cls, initial_element);

    // prepare the arguments
    jstring method_args_0 = env->NewStringUTF("Get the flask in the room.");
    env->SetObjectArrayElement(method_args, 0, method_args_0);
    std::cout << "Finished preparations" << std::endl;

    // run the function
    //env->CallStaticVoidMethod(cls, mid, method_args);
    //std::cout << "main" << std::endl;
    env->CallVoidMethod(jobj, midSetup);
    std::cout << "setup" << std::endl;
    env->CallVoidMethod(jobj, midAnnotate, method_args_0);
    std::cout << "annotate" << std::endl;
    env->CallVoidMethod(jobj, midDependencies);
    std::cout << "dependencies" << std::endl;
    jvm->DestroyJavaVM();
    std::cout << "destroyed JVM" << std::endl;

    return 0;
}

Compiling the C++ with g++ and -Wall gives no warnings or errors, and neither does compiling the Java with javac. When I run the C++ code I get the following output.

Got all the methods
Finished preparations
creating properties
starting the parser pipeline
setup
text
Get the flask in the room.
annotate
dependencies
destroyed JVM

Following the couts and printlns, starting with the C++, you can see that the C++ successfully gets the methods and finishes the JVM and method preparations before calling the setup method in Java. That setup method starts, calls the first println, creates the properties and assigns the values, then quits before it can start the parser pipeline and returns to the C++. It's basically the same story moving forward: the annotate text function is called and successfully receives the text from the C++ method call, but quits before it creates the annotation object. I don't have as many debug printlns in dependencies because at that point it doesn't matter, but none of the existing printlns are called. At the very end the JVM is destroyed and the program ends.

Thank you for any help or insight you can provide.


Solution

  • JNI method calls are always synchronous. When a call returns before it has reached the end of the Java method, it's because the code threw an exception. A Java exception doesn't propagate to C++ as a C++ exception automatically; it stays pending in the VM, so you have to check for a pending exception after every call.

    A common cause when code runs fine from other Java code but fails when called through JNI is the VM's classpath. While java.exe resolves * and adds every matching JAR to the classpath, programs using the invocation interface have to do that themselves. The -Djava.class.path option in JavaVMOption works with real paths only (no wildcards). Also, you can only pass actual VM options and not launcher arguments like -cp, because those too are resolved only by java.exe and are not part of the invocation interface.