Can't build ParagraphVectors in Linux


I'm using the Doc2Vec algorithm with Deeplearning4j. It works fine on my Windows 10 PC, but when I run it on a Linux box I get the following error:

java.lang.NoClassDefFoundError: Could not initialize class org.nd4j.linalg.factory.Nd4j
at org.deeplearning4j.models.embeddings.inmemory.InMemoryLookupTable$Builder.<init>(InMemoryLookupTable.java:581) ~[run.jar:?]
at org.deeplearning4j.models.sequencevectors.SequenceVectors$Builder.presetTables(SequenceVectors.java:801) ~[run.jar:?]
at org.deeplearning4j.models.paragraphvectors.ParagraphVectors$Builder.build(ParagraphVectors.java:663) ~[run.jar:?]

I've tried this on a couple of Linux machines. Both were running Xubuntu, and I had sudo permissions on both.

Here is the code for creating my ParagraphVectors:

  InputStream is = new ByteArrayInputStream(baos.toByteArray());

  LabelAwareSentenceIterator iter;
  iter = new LabelAwareListSentenceIterator(is, DELIM);
  iter.setPreProcessor(new SentencePreProcessor() {
    @Override
    public String preProcess(String sentence) {
      return new InputHomogenization(sentence).transform();
    }
  });

  TokenizerFactory tokenizerFactory = new DefaultTokenizerFactory();
  vec = new ParagraphVectors.Builder().minWordFrequency(minWordFrequency).batchSize(batchSize)
      .iterations(iterations).layerSize(layerSize).stopWords(stopWords).windowSize(windowSize)
      .learningRate(learningRate).tokenizerFactory(tokenizerFactory).iterate(iter).build();
  vec.fit();

And here is my pom.xml (all versions are 0.7.1; I had previously been using 0.4-rc3.9 and got the same error):

    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-ui-model</artifactId>
        <version>${dl4j.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-nlp</artifactId>
        <version>${dl4j.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native</artifactId>
        <version>${nd4j.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>log4j</groupId>
                <artifactId>log4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.datavec/datavec-api -->
    <dependency>
        <groupId>org.datavec</groupId>
        <artifactId>datavec-api</artifactId>
        <version>${nd4j.version}</version>
    </dependency>

Solution

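  • First of all, always stick to the latest version. Could you post the full stack trace? The trace shown is definitely not the root cause: "Could not initialize class" means the static initializer of Nd4j had already failed earlier, and that first failure (typically an ExceptionInInitializerError) carries the real reason. Usually this is a problem with missing native artifacts, so maybe try using nd4j-native-platform instead of nd4j-native?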
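
One way to get at the full stack trace asked for above is to force the class to initialize on its own, before any ParagraphVectors code runs, and print the whole cause chain. The following is only a minimal sketch: the class name Nd4jInitProbe and its two helper methods are made up for illustration and are not part of Deeplearning4j or ND4J. It relies only on standard JDK calls (Class.forName, Throwable.getCause) and assumes it is run with the same classpath as the failing application.

```java
public class Nd4jInitProbe {

    /** Walks the cause chain and returns the innermost Throwable. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain (or at a self-referencing cause).
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Loads the named class and triggers its static initializer.
     * Returns null on success, otherwise whatever was thrown
     * (ClassNotFoundException, ExceptionInInitializerError, ...).
     */
    public static Throwable probe(String className) {
        try {
            Class.forName(className);
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        // The platform is printed because native artifacts are platform-specific.
        System.out.println("os.name=" + System.getProperty("os.name")
                + ", os.arch=" + System.getProperty("os.arch"));

        Throwable failure = probe("org.nd4j.linalg.factory.Nd4j");
        if (failure == null) {
            System.out.println("Nd4j initialized without errors");
            return;
        }
        failure.printStackTrace();
        System.out.println("Root cause: " + rootCause(failure));
    }
}
```

Running this on the Linux box as the very first thing in the JVM matters, because only the first initialization attempt reports the underlying exception; later attempts just repeat the NoClassDefFoundError from the question.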