machine-learning · neural-network · deep-learning · perceptron

Multilayer perceptron always picks the last class it was trained on (backpropagation)


I'm trying to write an MLP that classifies an input into one of three objects. Each object is represented by a range of numbers:

1-10 : Banana
11-20 : Apple
21-30 : Carrot

There are only two layers in the MLP: one hidden layer (2 units) and one output layer (3 units).
Each unit has:

  • inputs[] (the inputs passed to this unit)
  • weights[] (one weight per input)
  • delta (the error term set during back-propagation)
  • sum (the weighted sum of the inputs)
  • output (the activated sum)
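
For reference, the unit class looks roughly like this (a simplified sketch; the constructor and its weight initialization are illustrative, not my exact code):

// Simplified sketch of the unit described above; the constructor and its
// random weight initialization are illustrative.
class Unit {
    double[] inputs;   // inputs passed to this unit
    double[] weights;  // one weight per input
    double delta;      // error term set during back-propagation
    double sum;        // weighted sum of the inputs
    double output;     // activated sum, i.e. sigmoid(sum)

    Unit(int inputCount) {
        weights = new double[inputCount];
        for (int i = 0; i < weights.length; i++)
            weights[i] = Math.random() * 2 - 1; // random weights in [-1, 1)
    }

    void setDelta(double delta) { this.delta = delta; }
    double getDelta()           { return delta; }
    double getWeight(int i)     { return weights[i]; }
}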

Each unit also has an activation function:

double activate(double[] inputs) {
        this.inputs = inputs;
        sum = 0;

        for (int i = 0; i < inputs.length; i++)
            sum += weights[i] * inputs[i];

        output = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid activation

        return output;
    }

and a function to correct weights:

void correctWeights(double momentum, double learningRate) {
        for (int i = 0; i < weights.length; i++) {
            weights[i] = weights[i] * momentum + learningRate * delta * (output * (1 - output)) * inputs[i];
        }
    }

where output * (1 - output) is the derivative of the sigmoid.

To train the network I have a function that loops N times; in each iteration I generate an input for a random object, propagate it through the network, and apply back-propagation.

private void train() {
        for (int i = 0; i < 10000; i++) {
            int[] expectedOutput = new int[3];
            double[] inputs = {ThreadLocalRandom.current().nextInt(1, 30 + 1)};
            if (inputs[0] <= 10) {
                expectedOutput[0] = 1;
                expectedOutput[1] = 0;
                expectedOutput[2] = 0;
            } else if (inputs[0] <= 20) {
                expectedOutput[0] = 0;
                expectedOutput[1] = 1;
                expectedOutput[2] = 0;
            } else {
                expectedOutput[0] = 0;
                expectedOutput[1] = 0;
                expectedOutput[2] = 1;
            }
            double[] outputs = propagate(inputs);
            backPropagate(expectedOutput, outputs);
        }
    }

The propagation function simply goes through the whole net and activates the units, layer by layer:

private double[] propagate(double[] inputs) {
        double[] hiddenOutputs = new double[hiddenLayer.length];
        for (int i = 0; i < hiddenLayer.length; i++)
            hiddenOutputs[i] = hiddenLayer[i].activate(inputs);

        double[] outputs = new double[outputLayer.length];
        for (int i = 0; i < outputs.length; i++)
            outputs[i] = outputLayer[i].activate(hiddenOutputs);

        return outputs;
    }

The back-propagation algorithm was taken from http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html

private void backPropagate(int[] expectedOutput, double[] output) {
        // output layer: delta is the raw error (expected - actual)
        for (int i = 0; i < outputLayer.length; i++) {
            outputLayer[i].setDelta(expectedOutput[i] - output[i]);
        }

        // hidden layer: delta is the sum of the output-layer deltas,
        // weighted by each connection to this hidden unit
        for (int i = 0; i < hiddenLayer.length; i++) {
            double delta = 0;
            for (int j = 0; j < outputLayer.length; j++) {
                delta += outputLayer[j].getDelta() * outputLayer[j].getWeight(i);
            }
            hiddenLayer[i].setDelta(delta);
        }

        // update all weights using the deltas computed above
        for (int i = 0; i < hiddenLayer.length; i++)
            hiddenLayer[i].correctWeights(momentum, learningRate);

        for (int i = 0; i < outputLayer.length; i++)
            outputLayer[i].correctWeights(momentum, learningRate);
    }

It also has a function to recognize objects after it has been trained:

private void recognize(String number) {
        double[] inputs = {Double.parseDouble(number)};
        double[] outputs = propagate(inputs);
        System.out.println("Banana: " + outputs[0]);
        System.out.println("Apple: " + outputs[1]);
        System.out.println("Carrot: " + outputs[2]);
    }
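
These pieces are wired together roughly like this (simplified driver code; the class name Mlp is just for illustration):

public static void main(String[] args) {
    Mlp net = new Mlp();   // builds the hidden and output layers
    net.train();           // 10000 training iterations
    net.recognize("15");   // should ideally print the highest value for Apple
}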

So the problem is that whatever number I pass to the recognize function, I get output similar to this:

Banana: 0.49984367018594233
Apple: 0.49984367018594233
Carrot: 0.5001563298140577

Carrot is chosen every time (carrot is also the last object the net was trained on). So if I input 5 it will output that it's a carrot; if I input 15 it will output that it's a carrot. If I change the order of the objects learned in the train function so that the banana is the last object, then the net will always pick the banana as its answer.

I've been working on this for a few days now and couldn't find a solution. What am I doing wrong?


Solution

  • I notice that you select a random number between 1 and 30 and then determine an output for it, but you are forgetting to normalize the input. Neural networks work best when the inputs are within the range 0-1 (this depends on which activation function you use).

    So what is left for you to do is to normalize this input, that is, to scale every input to a number between 0 and 1.

    Your input is a numerical value, so all you have to do is choose a maximum value and divide all the values by it. In your case this would be 30, as no input is higher than 30. Each number then gets converted as follows:

    10 -> 10 / 30 -> 0.33
    15 -> 15 / 30 -> 0.50
    etc.
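
    As a minimal sketch, your train function could then look like this (the normalize helper and the MAX_INPUT constant are additions of mine; everything else reuses your existing methods):

    private static final double MAX_INPUT = 30.0; // highest possible input value

    private double normalize(double value) {
        return value / MAX_INPUT; // maps 1..30 to roughly 0.03..1.0
    }

    private void train() {
        for (int i = 0; i < 10000; i++) {
            int raw = ThreadLocalRandom.current().nextInt(1, 30 + 1);
            int[] expectedOutput = new int[3];
            if (raw <= 10)      expectedOutput[0] = 1; // banana
            else if (raw <= 20) expectedOutput[1] = 1; // apple
            else                expectedOutput[2] = 1; // carrot

            double[] inputs = {normalize(raw)};        // normalized input
            double[] outputs = propagate(inputs);
            backPropagate(expectedOutput, outputs);
        }
    }

    Remember to normalize in recognize as well, otherwise the trained network is fed values far outside the range it was trained on:

    private void recognize(String number) {
        double[] inputs = {normalize(Double.parseDouble(number))};
        double[] outputs = propagate(inputs);
        System.out.println("Banana: " + outputs[0]);
        System.out.println("Apple: " + outputs[1]);
        System.out.println("Carrot: " + outputs[2]);
    }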
    

    Read more about normalization here.