
Maven Apache OpenNLP tools getting NullPointerException


I am trying to use Apache OpenNLP with Maven. I added the dependency to the pom.xml:

<groupId>org.example</groupId>
<artifactId>nlp-fun</artifactId>
<version>1.0-SNAPSHOT</version>

<properties>
    <maven.compiler.target>1.8</maven.compiler.target>
    <maven.compiler.source>1.8</maven.compiler.source>
</properties>

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.opennlp/opennlp-tools -->
    <dependency>
        <groupId>org.apache.opennlp</groupId>
        <artifactId>opennlp-tools</artifactId>
        <version>1.9.2</version>
    </dependency>

</dependencies>

When I run the following code to create a LanguageDetectorModel object:

public class Program {

public void fun() throws Exception{
    InputStream targetStream = new FileInputStream(new File("C:\\Users\\aaa\\Desktop\\nlp-fun\\src\\main\\input.txt"));
    LanguageDetectorModel m = new LanguageDetectorModel(targetStream);
    LanguageDetector myCategorizer = new LanguageDetectorME(m);
}

public static void main(String[] args) throws Exception{
    Program program = new Program();
    program.fun();
}

}

I receive the following NullPointerException and I am not sure what to do. I also tried adding opennlp-tools as an external JAR, but that did not work either:

Exception in thread "main" java.lang.NullPointerException
at opennlp.tools.util.model.BaseModel.getManifestProperty(BaseModel.java:506)
at opennlp.tools.util.model.BaseModel.initializeFactory(BaseModel.java:248)
at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:234)
at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:176)
at opennlp.tools.langdetect.LanguageDetectorModel.<init>(LanguageDetectorModel.java:50)
at Program.fun(Program.java:18)
at Program.main(Program.java:24)

Solution

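  • The stream passed to the LanguageDetectorModel constructor must be the trained language detection model, not the text you want to analyze. In the question, targetStream points at input.txt, which is not a model file, so loading fails inside BaseModel. Download the model file (langdetect-183.bin) from the OpenNLP website and load that instead; the text to classify is then passed to the detector itself. See the OpenNLP User's Manual for an example, reproduced below: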

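    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import opennlp.tools.langdetect.Language;
    import opennlp.tools.langdetect.LanguageDetector;
    import opennlp.tools.langdetect.LanguageDetectorME;
    import opennlp.tools.langdetect.LanguageDetectorModel;

    // Load the trained model (backslashes in a Java string literal must be escaped)
    InputStream is = new FileInputStream(new File("c:\\path\\to\\langdetect-183.bin"));
    LanguageDetectorModel m = new LanguageDetectorModel(is);

    String inputText = "What language is this text?";
    LanguageDetector myCategorizer = new LanguageDetectorME(m);

    // Get the most probable language
    Language bestLanguage = myCategorizer.predictLanguage(inputText);
    System.out.println("Best language: " + bestLanguage.getLang());
    System.out.println("Best language confidence: " + bestLanguage.getConfidence());

    // Get an array with the most probable languages
    Language[] languages = myCategorizer.predictLanguages(inputText);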
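
If you want to catch this mistake before OpenNLP throws, a rough pre-check is possible with the JDK alone. The sketch below assumes that an OpenNLP model file is a zip archive containing an entry named manifest.properties (which would explain why the stack trace ends in BaseModel.getManifestProperty); treat that as an assumption to verify against your OpenNLP version. The class name ModelFileCheck and both helper methods are made up for illustration and are not part of OpenNLP. It uses only Java 8 APIs, matching the compiler level in the pom.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of OpenNLP.
public class ModelFileCheck {

    // Assumption: OpenNLP model archives keep their metadata in this zip entry.
    static final String MANIFEST_ENTRY = "manifest.properties";

    /** Returns true if the file is a zip archive that contains the manifest entry. */
    static boolean looksLikeModel(Path file) throws IOException {
        try (ZipFile zip = new ZipFile(file.toFile())) {
            return zip.getEntry(MANIFEST_ENTRY) != null;
        } catch (ZipException e) {
            // Not a zip archive at all, e.g. a plain text file.
            return false;
        }
    }

    /** Writes a stub archive for the demo only; this is NOT a usable model. */
    static void writeStubArchive(Path target) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            out.putNextEntry(new ZipEntry(MANIFEST_ENTRY));
            out.write("demo=true\n".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        // A plain text file, standing in for the input.txt from the question.
        Path text = Files.createTempFile("input", ".txt");
        Files.write(text, "What language is this text?".getBytes(StandardCharsets.UTF_8));

        // A stub archive that only mimics the assumed layout of a model file.
        Path archive = Files.createTempFile("stub-model", ".bin");
        writeStubArchive(archive);

        System.out.println(looksLikeModel(text));    // prints "false"
        System.out.println(looksLikeModel(archive)); // prints "true"

        Files.delete(text);
        Files.delete(archive);
    }
}
```

Calling looksLikeModel on the path before constructing LanguageDetectorModel lets you fail with a readable message instead of a NullPointerException. It only checks the container layout, so a file that passes can still be rejected by OpenNLP.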