java · neural-network · multiclass-classification · dl4j

How to Configure Neural Network to Produce Multiple Binary Outputs in DL4J


I am learning DL4J, and I would like to configure a network that accepts a tuple of double values and produces a tuple of binary values, in which several may be set to 1 and the others to 0. In the language of neural nets, would I call this multi-class one-hot encoding?

Example:

[3.5,  2.9, 15.0] -> [0, 0, 1, 0, 1]
[2.5, 12.5,  5.0] -> [1, 1, 0, 0, 1]
[5.9, 71.3,  0.7] -> [0, 1, 1, 0, 0]

etc.

I have tried this:

MultiLayerConfiguration multiLayerConfiguration = 
    new NeuralNetConfiguration.Builder()
        .seed(System.nanoTime())
        .iterations(10000)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .learningRate(0.1)
        .useDropConnect(false)
        .biasInit(0)
        .miniBatch(false)
        .updater(Updater.NESTEROVS)
        .list()
        .layer(0, new DenseLayer.Builder()
            .nIn(3)
            .nOut(8)
            .weightInit(WeightInit.XAVIER)
            .activation(Activation.SIGMOID)
            .build())
        .layer(1, new OutputLayer.Builder()
            .nIn(8)
            .nOut(5)
            .weightInit(WeightInit.XAVIER)
            .activation(Activation.SOFTMAX)
            .lossFunction(LossFunctions.LossFunction.RECONSTRUCTION_CROSSENTROPY)
            .build())
        .pretrain(false)
        .backprop(true)
        .build();

But I seem to be getting fractional values in the output, as if the network is trying to distribute the activation evenly across the outputs. How do I configure the network so it gives me multiple 1's and 0's as a classification?

For example, if the output was 3 dimensional, I would want this:

[[0.00,  0.49,  0.51],  
 [0.50,  0.00,  0.50],  
 [0.50,  0.50,  0.00],  
 [0.33,  0.33,  0.34],  
 [0.00,  0.00,  1.00]]

To really be this:

[[0.00,  1.00,  1.00],  
 [1.00,  0.00,  1.00],  
 [1.00,  1.00,  0.00],  
 [1.00,  1.00,  1.00],  
 [0.00,  0.00,  1.00]]

Solution

  • You shouldn't use a softmax output here: softmax forces the outputs to sum to 1, which is why the activation looks evenly distributed. For independent binary outputs like this, use a sigmoid activation with binary cross-entropy (XENT) loss instead.

    Also, this code looks a bit old. Make sure you are using 0.9.1. Don't use reconstruction cross entropy; use KL divergence if you are doing unsupervised learning (autoencoders and the like), but for this case you shouldn't be using reconstruction error at all.

    Also, the iterations knob is going away in the next release. Use for loops instead; that knob is legacy (just leave it at 1 in the meantime).

    Again, I strongly encourage you to follow our examples more closely. They cover everything you need for multi-class classification, or really any use case. If you can't find something, try a keyword search in the repo. Failing that, ask here or on our community Gitter.
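As a minimal sketch of what the answer suggests, here is the question's 0.9.1-style builder with the two fixes applied: a sigmoid output activation with XENT (binary cross-entropy) loss, and the iterations knob left at 1. The hidden layer is kept as in the original; seed and learning rate are illustrative.

```java
MultiLayerConfiguration multiLayerConfiguration =
    new NeuralNetConfiguration.Builder()
        .seed(12345)                             // fixed seed for reproducibility (illustrative)
        .iterations(1)                           // legacy knob: leave at 1, loop over fit() instead
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .learningRate(0.1)
        .updater(Updater.NESTEROVS)
        .list()
        .layer(0, new DenseLayer.Builder()
            .nIn(3)
            .nOut(8)
            .weightInit(WeightInit.XAVIER)
            .activation(Activation.SIGMOID)
            .build())
        .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.XENT)
            .nIn(8)
            .nOut(5)
            .weightInit(WeightInit.XAVIER)
            .activation(Activation.SIGMOID)      // independent [0,1] score per output, no sum-to-1 constraint
            .build())
        .pretrain(false)
        .backprop(true)
        .build();
```

With this setup each output is an independent probability; to get the 0/1 vector you want, threshold each output (commonly at 0.5) after calling `output()` on the trained network.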