
How to use an existing DL4J trained model to classify new input


I have a DL4J LSTM model that produces a binary classification of sequential input. I have trained and tested the model and am happy with the precision/recall. Now I want to use this model to predict the binary classification of new inputs. How do I do this? That is, how do I give the trained neural network a single input (a file containing the sequence of feature rows) and get the binary classification of that input file?

Here is my original training data set iterator:

    SequenceRecordReader trainFeatures = new CSVSequenceRecordReader(0, ",");  //skip no header lines
    try {
        trainFeatures.initialize( new NumberedFileInputSplit(featureBaseDir + "/s_%d.csv", 0,this._modelDefinition.getNB_TRAIN_EXAMPLES()-1));
    } catch (IOException e) {
        trainFeatures.close();
        throw new IOException(String.format("IO error %s. during trainFeatures", e.getMessage()));
    } catch (InterruptedException e) {
        trainFeatures.close();
        throw new IOException(String.format("Interrupted exception error %s. during trainFeatures", e.getMessage()));
    }

    SequenceRecordReader trainLabels = new CSVSequenceRecordReader();
    try {
        trainLabels.initialize(new NumberedFileInputSplit(labelBaseDir + "/s_%d.csv", 0,this._modelDefinition.getNB_TRAIN_EXAMPLES()-1));
    } catch (InterruptedException e) {
        trainLabels.close();
        trainFeatures.close();
        throw new IOException(String.format("Interrupted exception error %s. during trainLabels initialise", e.getMessage()));
    }



    DataSetIterator trainData = new SequenceRecordReaderDataSetIterator(trainFeatures, trainLabels,
            this._modelDefinition.getBATCH_SIZE(),this._modelDefinition.getNUM_LABEL_CLASSES(), false, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

Here is my model:

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(this._modelDefinition.getRANDOM_SEED())    //Random number generator seed for improved repeatability. Optional.
            .weightInit(WeightInit.XAVIER)
            .updater(new Nesterovs(this._modelDefinition.getLEARNING_RATE()))
            .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)  //Not always required, but helps with this data set
            .gradientNormalizationThreshold(0.5)
            .list()
            .layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(this._modelDefinition.getNB_INPUTS()).nOut(this._modelDefinition.getLSTM_LAYER_SIZE()).build())
            .layer(1, new LSTM.Builder().activation(Activation.TANH).nIn(this._modelDefinition.getLSTM_LAYER_SIZE()).nOut(this._modelDefinition.getLSTM_LAYER_SIZE()).build())
            .layer(2,new DenseLayer.Builder().nIn(this._modelDefinition.getLSTM_LAYER_SIZE()).nOut(this._modelDefinition.getLSTM_LAYER_SIZE())
                    .weightInit(WeightInit.XAVIER)
                    .build())
            .layer(3, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                    .activation(Activation.SOFTMAX).nIn(this._modelDefinition.getLSTM_LAYER_SIZE()).nOut(this._modelDefinition.getNUM_LABEL_CLASSES()).build())
            .pretrain(false).backprop(true).build();

I train the model over N epochs to get my optimal scores and then save it. Now I want to load the saved model and get classifications for new sequential feature files.

If there is an example of this, please let me know where.

thanks

anton


Solution

  • The answer is to feed the model input in exactly the same format as the training data, except with the labels set to -1. The output is an INDArray that holds, for each example, the probability of class 0 in one row and the probability of class 1 in the other. The probabilities show up at the last line of the sequence.

    Here is the code:

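    // Imports needed by this method
    import java.io.File;
    import java.io.IOException;
    import org.datavec.api.records.reader.SequenceRecordReader;
    import org.datavec.api.records.reader.impl.csv.CSVSequenceRecordReader;
    import org.datavec.api.split.NumberedFileInputSplit;
    import org.deeplearning4j.datasets.datavec.SequenceRecordReaderDataSetIterator;
    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.deeplearning4j.util.ModelSerializer;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

    public void getOutputsForTheseInputsUsingThisNet(String netFilePath,String inputFileDir) throws Exception {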
    
        //open the network file
        File locationToSave = new File(netFilePath);
        MultiLayerNetwork nNet = null;
        logger.info("Trying to open the model");
        try {
            nNet = ModelSerializer.restoreMultiLayerNetwork(locationToSave);
            logger.info("Success: Model opened");
        } catch (IOException e) {
            throw new Exception(String.format("Unable to open model from %s because of error %s", locationToSave.getAbsolutePath(),e.getMessage()));
        }
    
        logger.info("Loading test data");
        SequenceRecordReader testFeatures = new CSVSequenceRecordReader(0, ",");  //skip no lines at the top - i.e. no header
        try {
            testFeatures.initialize(new NumberedFileInputSplit(inputFileDir + "/features/s_4180%d.csv", 0, 4));
        } catch (InterruptedException e) {
            testFeatures.close();
            throw new Exception(String.format("Interrupted exception error %s. during testFeatures initialise", e.getMessage()));
        }
        logger.info("Loading label data");
        SequenceRecordReader testLabels = new CSVSequenceRecordReader();
        try {
            testLabels.initialize(new NumberedFileInputSplit(inputFileDir + "/labels/s_4180%d.csv", 0,4));
        } catch (InterruptedException e) {
            testLabels.close();
            testFeatures.close();
            throw new IOException(String.format("Interrupted exception error %s. during testLabels initialise", e.getMessage()));
        }
    
    
        logger.info("creating iterator");
    
        DataSetIterator testData =  new SequenceRecordReaderDataSetIterator(testFeatures, testLabels,
                this._modelDefinition.getBATCH_SIZE(),this._modelDefinition.getNUM_LABEL_CLASSES(), false, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
    
    
        //now use it to classify some data
        logger.info("classifying examples");
    
        INDArray output = nNet.output(testData);
        logger.info("outputting the classifications");
        if(output==null||output.isEmpty())
            throw new Exception("There is no output");
        System.out.println(output);
    
        //sample output
    

        // [[[ 0, 0, 0, 0, 0.9882, 0, 0, 0, 0],
        //   [ 0, 0, 0, 0, 0.0118, 0, 0, 0, 0]],
        //
        //  [[ 0, 0.1443, 0, 0, 0, 0, 0, 0, 0],
        //   [ 0, 0.8557, 0, 0, 0, 0, 0, 0, 0]],
        //
        //  [[ 0, 0, 0, 0, 0, 0, 0, 0, 0.9975],
        //   [ 0, 0, 0, 0, 0, 0, 0, 0, 0.0025]],
        //
        //  [[ 0, 0, 0, 0, 0, 0, 0.8482, 0, 0],
        //   [ 0, 0, 0, 0, 0, 0, 0.1518, 0, 0]],
        //
        //  [[ 0, 0, 0, 0.8760, 0, 0, 0, 0, 0],
        //   [ 0, 0, 0, 0.1240, 0, 0, 0, 0, 0]]]

    }
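
To turn the printed array into one label per input file, something along these lines may help. It is only a sketch: it works on a plain double[][][] rather than an INDArray, and it assumes the values have already been copied out in [example][class][timeStep] order, which is how the sample output above is laid out. The class and method names (LastStepClassifier, predictedClasses) are made up for illustration and are not part of DL4J.

```java
import java.util.Arrays;

// Hypothetical helper, not part of DL4J: reads one predicted class per example
// from probabilities laid out as [example][class][timeStep].
class LastStepClassifier {

    /**
     * For each example, finds the last time step that carries a non-zero
     * probability and returns the index of the most probable class at that step.
     * An example whose time steps are all zero gets -1.
     */
    static int[] predictedClasses(double[][][] output) {
        int[] predictions = new int[output.length];
        for (int ex = 0; ex < output.length; ex++) {
            double[][] byClass = output[ex];
            int numSteps = byClass.length == 0 ? 0 : byClass[0].length;

            // Search backwards for the last time step with any non-zero value.
            int step = -1;
            for (int t = numSteps - 1; t >= 0 && step < 0; t--) {
                for (double[] classRow : byClass) {
                    if (classRow[t] != 0.0) {
                        step = t;
                        break;
                    }
                }
            }

            predictions[ex] = -1;
            if (step >= 0) {
                // Pick the class with the highest probability at that step.
                int best = 0;
                for (int c = 1; c < byClass.length; c++) {
                    if (byClass[c][step] > byClass[best][step]) {
                        best = c;
                    }
                }
                predictions[ex] = best;
            }
        }
        return predictions;
    }

    public static void main(String[] args) {
        // The five examples from the sample output above.
        double[][][] sample = {
            {{0, 0, 0, 0, 0.9882, 0, 0, 0, 0}, {0, 0, 0, 0, 0.0118, 0, 0, 0, 0}},
            {{0, 0.1443, 0, 0, 0, 0, 0, 0, 0}, {0, 0.8557, 0, 0, 0, 0, 0, 0, 0}},
            {{0, 0, 0, 0, 0, 0, 0, 0, 0.9975}, {0, 0, 0, 0, 0, 0, 0, 0, 0.0025}},
            {{0, 0, 0, 0, 0, 0, 0.8482, 0, 0}, {0, 0, 0, 0, 0, 0, 0.1518, 0, 0}},
            {{0, 0, 0, 0.8760, 0, 0, 0, 0, 0}, {0, 0, 0, 0.1240, 0, 0, 0, 0, 0}}
        };
        System.out.println(Arrays.toString(predictedClasses(sample))); // prints [0, 1, 0, 0, 0]
    }
}
```

For the sample output above, only the second input file comes out as class 1; the other four are class 0.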