Tags: java, maven, benchmarking, jmh, dl4j

Benchmarking my neural network with JMH, but how do I mix my Maven dependencies?


I followed this guide (http://tutorials.jenkov.com/java-performance/jmh.html) and created a new project with the class MyBenchmark, which looks like this:

package com.jenkov;

import org.openjdk.jmh.annotations.Benchmark;
import org.datavec.api.records.reader.RecordReader;
import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
import org.datavec.api.split.FileSplit;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.PerformanceListener;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.deeplearning4j.examples.utils.DownloaderUtility;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.SplitTestAndTrain;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.NormalizerStandardize;
import org.nd4j.linalg.learning.config.Sgd;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;

public class MyBenchmark {

    @Benchmark
    public void testMethod() {


            Logger log = LoggerFactory.getLogger(MyBenchmark.class);


                //First: get the dataset using the record reader. CSVRecordReader handles loading/parsing
                int numLinesToSkip = 0;
                char delimiter = ',';
                RecordReader recordReader = new CSVRecordReader(numLinesToSkip,delimiter);
                recordReader.initialize(new FileSplit(new File(DownloaderUtility.IRISDATA.Download(),"iris.txt")));

                //Second: the RecordReaderDataSetIterator handles conversion to DataSet objects, ready for use in neural network
                int labelIndex = 4;     //5 values in each row of the iris.txt CSV: 4 input features followed by an integer label (class) index. Labels are the 5th value (index 4) in each row
                int numClasses = 3;     //3 classes (types of iris flowers) in the iris data set. Classes have integer values 0, 1 or 2
                int batchSize = 150;    //Iris data set: 150 examples total. We are loading all of them into one DataSet (not recommended for large data sets)

                DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader,batchSize,labelIndex,numClasses);
                DataSet allData = iterator.next();
                allData.shuffle();
                SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.65);  //Use 65% of data for training

                DataSet trainingData = testAndTrain.getTrain();
                DataSet testData = testAndTrain.getTest();

                //We need to normalize our data. We'll use NormalizeStandardize (which gives us mean 0, unit variance):
                DataNormalization normalizer = new NormalizerStandardize();
                normalizer.fit(trainingData);           //Collect the statistics (mean/stdev) from the training data. This does not modify the input data
                normalizer.transform(trainingData);     //Apply normalization to the training data
                normalizer.transform(testData);         //Apply normalization to the test data. This is using statistics calculated from the *training* set


                final int numInputs = 4;
                int outputNum = 3;
                long seed = 6;


                log.info("Build model....");
                MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .seed(seed)
                    .activation(Activation.TANH)
                    .weightInit(WeightInit.XAVIER)
                    .updater(new Sgd(0.1))
                    .l2(1e-4)
                    .list()
                    .layer(new DenseLayer.Builder().nIn(numInputs).nOut(3)
                        .build())
                    .layer(new DenseLayer.Builder().nIn(3).nOut(3)
                        .build())
                    .layer( new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX) //Override the global TANH activation with softmax for this layer
                        .nIn(3).nOut(outputNum).build())
                    .build();

                //run the model
                MultiLayerNetwork model = new MultiLayerNetwork(conf);
                model.init();
                //record score and performance stats once every 100 iterations
                //(both listeners go in one call, because setListeners replaces any listeners set earlier)
                model.setListeners(new ScoreIterationListener(100), new PerformanceListener(100));
                for(int i=0; i<1000; i++ ) {
                    model.fit(trainingData);
                }

                //evaluate the model on the test set
                Evaluation eval = new Evaluation(3);
                INDArray output = model.output(testData.getFeatures());
                eval.eval(testData.getLabels(), output);
                log.info(eval.stats());

    }

}

And here is my pom.xml with the deeplearning4j-examples dependency in it:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>com.jenkov</groupId>
<artifactId>first-benchmark</artifactId>
<version>1.0</version>
<packaging>jar</packaging>

<name>JMH benchmark sample: Java</name>

<!--
   This is the demo/sample template build script for building Java benchmarks with JMH.
   Edit as needed.
-->

<dependencies>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>${jmh.version}</version>
    </dependency>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>${jmh.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.datavec</groupId>
        <artifactId>datavec-api</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-datasets</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-examples</artifactId>
        <version>0.0.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-nn</artifactId>
        <version>1.0.0-beta7</version>
    </dependency>
    <dependency>
        <groupId>com.jenkov</groupId>
        <artifactId>first-benchmark</artifactId>
        <version>1.0</version>
    </dependency>
</dependencies>

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

    <!--
        JMH version to use with this project.
      -->
    <jmh.version>1.28</jmh.version>

    <!--
        Java source/target to use for compilation.
      -->
    <javac.target>1.8</javac.target>

    <!--
        Name of the benchmark Uber-JAR to generate.
      -->
    <uberjar.name>benchmarks</uberjar.name>
</properties>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.8.0</version>
            <configuration>
                <compilerVersion>${javac.target}</compilerVersion>
                <source>${javac.target}</source>
                <target>${javac.target}</target>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <finalName>${uberjar.name}</finalName>
                        <transformers>
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <mainClass>org.openjdk.jmh.Main</mainClass>
                            </transformer>
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                        </transformers>
                        <filters>
                            <filter>
                                <!--
                                    Shading signed JARs will fail without this.
                                    http://stackoverflow.com/questions/999489/invalid-signature-file-when-attempting-to-run-a-jar
                                -->
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
    <pluginManagement>
        <plugins>
            <plugin>
                <artifactId>maven-clean-plugin</artifactId>
                <version>2.5</version>
            </plugin>
            <plugin>
                <artifactId>maven-deploy-plugin</artifactId>
                <version>2.8.1</version>
            </plugin>
            <plugin>
                <artifactId>maven-install-plugin</artifactId>
                <version>2.5.1</version>
            </plugin>
            <plugin>
                <artifactId>maven-jar-plugin</artifactId>
                <version>2.4</version>
            </plugin>
            <plugin>
                <artifactId>maven-javadoc-plugin</artifactId>
                <version>2.9.1</version>
            </plugin>
            <plugin>
                <artifactId>maven-resources-plugin</artifactId>
                <version>2.6</version>
            </plugin>
            <plugin>
                <artifactId>maven-site-plugin</artifactId>
                <version>3.3</version>
            </plugin>
            <plugin>
                <artifactId>maven-source-plugin</artifactId>
                <version>2.2.1</version>
            </plugin>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.17</version>
            </plugin>
        </plugins>
    </pluginManagement>
</build>
</project>

Now, as far as I understand, I have to put my neural network code in there (it is a DL4J example network), run "mvn clean install" in the terminal, and then run "java -jar target/benchmarks.jar". But MyBenchmark.java can't resolve this import:

import org.deeplearning4j.examples.utils.DownloaderUtility;

even though I declared the dependency

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-examples</artifactId>
    <version>0.0.3.1</version>
</dependency>

for it. What am I missing? Is this approach even correct?

Thanks a lot


Solution

  • This can be a bit tricky, and I don't know exactly what you have done, but here is how to get it working.

    1. Create a simple "hello world" JMH project and check that it works.

    2. Download the DL4J examples from GitHub and run one of them to make sure it works.

    3. Now that both work, look through the DL4J examples and notice that the different modules each have their own pom file, and that there is also a master pom for the entire DL4J examples parent. Use the DL4J master example pom. Do not try to hand-write your own.

    4. My suggestion is to create a new project, use the DL4J master pom as your parent pom, and then add a second pom file inside your project's directory. You can see this layout in the DL4J examples on GitHub, and it looks like the best practice for your situation as well.
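
    To give a rough idea of what the DL4J examples pom takes care of for you, here is a minimal sketch of a dependency section in which every DL4J/ND4J artifact shares one version. Treat it as an illustration under assumptions, not as the examples pom itself: the property name dl4j.version is made up here, 1.0.0-beta7 is simply the version already used in the question, and the exact artifact list should be checked against the examples repository. As far as I know, deeplearning4j-core brings in the nn and DataVec pieces transitively, and DL4J also needs an ND4J backend such as nd4j-native-platform at runtime, which the pom in the question does not declare.

```xml
<properties>
    <!-- Assumption: one version for all DL4J/ND4J artifacts, taken from the question -->
    <dl4j.version>1.0.0-beta7</dl4j.version>
</properties>

<dependencies>
    <!-- Core DL4J; assumed to pull in the nn and DataVec modules transitively -->
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-core</artifactId>
        <version>${dl4j.version}</version>
    </dependency>
    <!-- CPU backend for ND4J; some backend has to be on the classpath at runtime -->
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native-platform</artifactId>
        <version>${dl4j.version}</version>
    </dependency>
</dependencies>
```

    Note the mismatch in the question: deeplearning4j-examples is declared at 0.0.3.1 while everything else is at 1.0.0-beta7. My understanding (worth verifying against the repository) is that DownloaderUtility is a helper class in the examples' own source tree, so you would get it by building inside the examples project or by copying the class into your own project, not from that old artifact.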