Tags: c++, c++11, neural-network, backpropagation

Neural Network gives same output for different inputs, doesn't learn


I have a neural network written in standard C++11 which I believe follows the back-propagation algorithm correctly (based on this). If I output the error at each step of the algorithm, however, it seems to oscillate without damping over time. I've tried removing momentum entirely and choosing a very small learning rate (0.02), but it still oscillates at roughly the same amplitude per network (with each network having a different amplitude within a certain range).
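For reference, the per-weight update I believe I am performing is the standard one with momentum; here is a minimal sketch (the names are only illustrative, not the actual members of my classes):

```cpp
// Minimal sketch of the per-weight update I am describing (illustrative
// names only; not the actual members in my code).
double updateWeight(double weight, double gradient, double& prevDelta,
                    double learningRate, double momentum)
{
    // delta_w = -eta * dE/dw + alpha * previous delta_w
    double delta = -learningRate * gradient + momentum * prevDelta;
    prevDelta = delta;          // remembered for the next iteration
    return weight + delta;
}
```

With momentum removed (i.e. the momentum term set to 0) this reduces to plain gradient descent, which is why I expected a small learning rate to damp the oscillation.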

Further, all inputs result in the same output (a problem I found posted here before, although for a different language; that author also mentions that he never got it working).

The code can be found here.

To summarize how I have implemented the network:

  • Neurons hold the current weights to the neurons ahead of them, previous changes to those weights, and the sum of all inputs.
  • Neurons can have their value (sum of all inputs) accessed, or can output the result of passing said value through a given activation function.
  • NeuronLayers act as Neuron containers and set up the actual connections to the next layer.
  • NeuronLayers can send the actual outputs to the next layer (instead of pulling from the previous).
  • FFNeuralNetworks act as containers for NeuronLayers and manage forward-propagation, error calculation, and back-propagation. They can also simply process inputs.
  • The input layer of an FFNeuralNetwork sends its weighted values (value * weight) to the next layer. Each neuron in every layer after that outputs the weighted result of the activation function unless it is a bias, or the layer is the output layer (biases output the weighted value, and the output layer simply passes the sum through the activation function). See the sketch after this list.
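To make the flow in the list above concrete, here is a rough, simplified sketch of the forward pass I intend (made-up types, and a sigmoid chosen only as an example activation; this is not the actual code from the repository):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Simplified sketch of the "push forward" flow described above.
// Types and names are illustrative, not the actual classes in the repo.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

struct Neuron {
    double sum = 0.0;                 // accumulated weighted input
    std::vector<double> weights;      // weights to neurons in the next layer
};

// Each neuron sends activation(sum) * weight to every neuron ahead of it
// (the input layer and biases would send sum * weight instead).
void feedForward(std::vector<Neuron>& layer, std::vector<Neuron>& next)
{
    for (Neuron& n : layer) {
        double out = sigmoid(n.sum);
        for (std::size_t j = 0; j < next.size(); ++j)
            next[j].sum += out * n.weights[j];
    }
}
```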

Have I made a fundamental mistake in the implementation (a misunderstanding of the theory), or is there some simple bug I haven't found yet? If it is a bug, where might it be?

Why might the error oscillate by the amount it does (around ±(0.2 ± learning rate)) even with a very low learning rate? Why might all the outputs be the same, no matter the input?

I've gone over most of it so many times that I might be skipping over something, but I think I may have a plain misunderstanding of the theory.


Solution

  • It turns out I was just staring at the FFNeuralNetwork parts too much and accidentally used the wrong input set to confirm the correctness of the network. It actually does work correctly with the right learning rate, momentum, and number of iterations.

    Specifically, in main, I was using the full 'inputs' array instead of the smaller array 'in' to test the outputs of the network.
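    In other words, the slip was roughly of this shape (a hypothetical snippet; only the names 'inputs' and 'in' come from the real code, everything else is made up for the example):

```cpp
#include <vector>

// Hypothetical illustration of the slip; only the names 'inputs' and 'in'
// come from my code, the rest is invented for this example.
void verifyOutputs()
{
    std::vector<double> inputs = { /* full data set used for training */ };
    std::vector<double> in     = { /* the small case I meant to test */ };

    // network.process(inputs);   // what I was accidentally passing
    // network.process(in);       // what I meant to pass
}
```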