
Neural Network Java XOR learning?


I'm trying to code a neural net in Java with 2 input neurons, 1 hidden layer containing 3 neurons, and 1 output neuron, which should be able to solve the XOR function. I understand how a single neuron (perceptron) works and how it learns, but I don't understand how a neural network handles outputs, or how neurons "communicate" with each other.

For example:

I have this neural network (ignore the values):

[Network diagram]

and this dataset:
input = {{1, 0},{1, 1},{0, 1},{0, 0}}
ideal = {1, 0, 1, 0}

With which values do I train each specific neuron? How do I make the neural network learn?


Solution

The goal of "training" a neural network is to find the right weights that correctly predict the output for a given input. There are two fundamental processes that go into training a neural network.

Forward Propagation: In this process we take the weights and inputs at each layer, compute each node's weighted sum, and then apply an activation function. For example, in the neural network you gave, the calculation for the second node in the hidden layer would be:

    1 * 0.4 + 1 * 0.9 = 1.3

We then apply an activation function to 1.3, our node value. I'm guessing this neural network uses a sigmoid activation function, which is a simple S-shaped curve built from the exponential, sigmoid(x) = 1 / (1 + e^-x): https://en.wikipedia.org/wiki/Sigmoid_function
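
In Java, that weighted sum plus the sigmoid step could look something like this. This is just a minimal sketch; the helper names sigmoid and weightedSum are mine, not from any library:

    // Sigmoid squashes any real number into the range (0, 1).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Weighted sum for one node: each input times its weight, summed.
    static double weightedSum(double[] inputs, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sum;
    }

    // The example above: sigmoid(1 * 0.4 + 1 * 0.9) = sigmoid(1.3) ≈ 0.786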

We do this for every node in the hidden layer, and those results become the inputs for the next layer, the output layer. After applying the activation function to the output node's value, we interpret the result as the output of the neural network. It will probably be off from what it should be, because the initial weights are random and so the answer is essentially random. That leads us to the next process, which will help us find the right weights. (A sketch of the full forward pass follows first.)
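
Putting the forward pass together, a 2-3-1 version could look like the sketch below, reusing the helpers above. The shape of the weight arrays is my own assumption, and biases are omitted here to keep it short:

    // Forward pass through a 2-3-1 network (biases omitted for brevity).
    // hiddenWeights[j] holds the two weights feeding hidden node j;
    // outputWeights[j] is the weight from hidden node j to the output node.
    static double forward(double[] input, double[][] hiddenWeights, double[] outputWeights) {
        double[] hidden = new double[hiddenWeights.length];
        for (int j = 0; j < hidden.length; j++) {
            hidden[j] = sigmoid(weightedSum(input, hiddenWeights[j]));
        }
        // The hidden activations become the inputs of the output layer.
        return sigmoid(weightedSum(hidden, outputWeights));
    }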

Back Propagation: This part requires a fair amount of mathematics, and if you're unfamiliar with calculus it may be difficult to understand at first. But the general idea behind back propagation is to use a method called gradient descent to make our weights less wrong by pushing them in the right direction. Explaining gradient descent in full detail would require a long answer in itself, so I'll point you to a good resource for understanding it:

http://eli.thegreenplace.net/2016/understanding-gradient-descent/
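
To tie the two processes together, here is a compact, self-contained sketch of the whole training loop for your 2-3-1 XOR setup: random initial weights, a forward pass, and a backward pass that does one gradient descent step on the squared error per sample. Every name and number here (class name, learning rate, epoch count) is my own illustrative choice, not the one correct way to do it:

    import java.util.Random;

    // Minimal 2-3-1 network trained on XOR with plain gradient descent.
    // Sketch only: fixed learning rate, no momentum, no batching.
    public class XorNet {
        static double[][] wHidden = new double[3][2]; // wHidden[j][i]: input i -> hidden j
        static double[] bHidden = new double[3];      // hidden biases
        static double[] wOutput = new double[3];      // wOutput[j]: hidden j -> output
        static double bOutput;                        // output bias

        static double sigmoid(double x) {
            return 1.0 / (1.0 + Math.exp(-x));
        }

        // Forward pass: fills `hidden` with the hidden activations and
        // returns the network output.
        static double forward(double[] x, double[] hidden) {
            for (int j = 0; j < 3; j++) {
                hidden[j] = sigmoid(x[0] * wHidden[j][0] + x[1] * wHidden[j][1] + bHidden[j]);
            }
            double sum = bOutput;
            for (int j = 0; j < 3; j++) {
                sum += hidden[j] * wOutput[j];
            }
            return sigmoid(sum);
        }

        public static void main(String[] args) {
            double[][] input = {{1, 0}, {1, 1}, {0, 1}, {0, 0}};
            double[] ideal = {1, 0, 1, 0};

            // Start from small random weights; training nudges them into place.
            Random rng = new Random();
            for (int j = 0; j < 3; j++) {
                wHidden[j][0] = rng.nextDouble() * 2 - 1;
                wHidden[j][1] = rng.nextDouble() * 2 - 1;
                bHidden[j] = rng.nextDouble() * 2 - 1;
                wOutput[j] = rng.nextDouble() * 2 - 1;
            }
            bOutput = rng.nextDouble() * 2 - 1;

            double learningRate = 0.5;
            for (int epoch = 0; epoch < 10000; epoch++) {
                for (int p = 0; p < input.length; p++) {
                    double[] x = input[p];
                    double[] hidden = new double[3];
                    double out = forward(x, hidden);

                    // Back propagation: how much, and in which direction, is each
                    // weight to blame? Output delta = error * sigmoid derivative.
                    double deltaOut = (ideal[p] - out) * out * (1 - out);

                    // Hidden deltas: blame flows back through the output weights.
                    double[] deltaHidden = new double[3];
                    for (int j = 0; j < 3; j++) {
                        deltaHidden[j] = deltaOut * wOutput[j] * hidden[j] * (1 - hidden[j]);
                    }

                    // Gradient descent step: nudge every weight to reduce the error.
                    for (int j = 0; j < 3; j++) {
                        wOutput[j] += learningRate * deltaOut * hidden[j];
                        wHidden[j][0] += learningRate * deltaHidden[j] * x[0];
                        wHidden[j][1] += learningRate * deltaHidden[j] * x[1];
                        bHidden[j] += learningRate * deltaHidden[j];
                    }
                    bOutput += learningRate * deltaOut;
                }
            }

            // After training, the outputs should be close to the ideal values.
            for (int p = 0; p < input.length; p++) {
                double out = forward(input[p], new double[3]);
                System.out.printf("%.0f XOR %.0f -> %.3f%n", input[p][0], input[p][1], out);
            }
        }
    }

If a run occasionally gets stuck with poor outputs, restarting with different random weights or adjusting the learning rate usually helps; small sigmoid networks on XOR can land in bad spots depending on the initialization.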

All in all, these two processes are what "training" a neural network means. The goal is to find the right weights using the training data, so that we can make correct predictions when we face data we haven't seen before.

WelchLabs made a really good video series about neural networks and how they work. You should definitely watch the entire series; it explains everything, including gradient descent:

https://www.youtube.com/watch?v=bxe2T-V8XRs