I was planning to write my own neural-net library in C++, and I was reading through other people's code to make sure I am on the right track. Below is a sample I am trying to learn from.
Everything in that code made sense except for the gradient-descent part, where the author is literally updating the weights by adding a term scaled by a positive learning rate. Shouldn't we take the negative of the gradient to move toward the optimum?
Lines 137 - 157 of the linked file:
double Neuron::eta = 0.10;

void Neuron::updateInputWeights(Layer &prevLayer)
{
    // The weights to be updated are in the Connection container
    // in the neurons in the preceding layer
    for (unsigned n = 0; n < prevLayer.size(); ++n)
    {
        Neuron &neuron = prevLayer[n];
        double oldDeltaWeight = neuron.m_outputWeights[m_myIndex].deltaWeight;

        double newDeltaWeight =
            // Individual input, magnified by the gradient and train rate:
            eta
            * neuron.getOutputVal()
            * m_gradient
            // Also add momentum = a fraction of the previous delta weight
            + alpha
            * oldDeltaWeight;

        // updating weights
        neuron.m_outputWeights[m_myIndex].deltaWeight = newDeltaWeight;
        neuron.m_outputWeights[m_myIndex].weight += newDeltaWeight;
    }
}
Everything in there just adds terms to the weight update; there is no negative sign anywhere.
https://github.com/huangzehao/SimpleNeuralNetwork/blob/master/src/neural-net.cpp
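For comparison, the plain gradient-descent step I was expecting would look something like this (a minimal, self-contained sketch of the standard rule, not code taken from the repo above):

#include <cstdio>

// Plain gradient descent on E(w) = 0.5 * (w - 3)^2, whose gradient is dE/dw = w - 3.
int main()
{
    double w = 0.0;
    const double eta = 0.1;

    for (int i = 0; i < 100; ++i)
    {
        double dE_dw = w - 3.0;   // gradient of the error at the current w
        w -= eta * dE_dw;         // note the minus sign: step AGAINST the gradient
    }

    std::printf("w converges to %f (the optimum is 3)\n", w);
    return 0;
}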
The good thing is that it works fine, which makes this even more confusing...
I have asked this question to everybody I know, and they all got confused as well.
Here is the video walkthrough of creating the neural-net library; it is the same code as above.
Yeah, this is indeed confusing, but I think the crux is in this line. (I may be wrong, but if you say the training works, then the only place that could flip the sign is here.)
eta * neuron.getOutputVal() * m_gradient
where m_gradient is the factor that carries the sign (direction) of the update.
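If you check how m_gradient is computed earlier in that file, the output-layer gradient in this style of code is built from targetVal - outputVal. I am going from memory of the tutorial this repo follows, so treat the function below as a sketch rather than a verbatim quote and verify it against the actual source:

// Sketch of the output-gradient computation in this style of code
// (member names mirror the repo, but check the real file):
void Neuron::calcOutputGradients(double targetVal)
{
    // For E = 0.5 * (targetVal - outputVal)^2, dE/dOutput = -(targetVal - outputVal),
    // so delta is already the NEGATIVE of the error derivative.
    double delta = targetVal - m_outputVal;
    m_gradient = delta * Neuron::transferFunctionDerivative(m_outputVal);
}

So weight += eta * getOutputVal() * m_gradient is effectively weight -= eta * getOutputVal() * dE/dnet: the minus sign of the descent rule is already folded into m_gradient, which is why updateInputWeights can simply add the term.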