
When do I need to update the weights in a Multilayer Perceptron?


I'm researching Multilayer Perceptrons, a kind of neural network. When I read about the backpropagation algorithm, I see that some authors suggest updating the weights immediately after computing all the errors for a specific layer, but other authors explain that we need to update the weights only after we get the errors for all layers. Which approach is correct?

1st Approach:

void BackPropagate() {
    ComputeErrorsForOutputLayer();
    UpdateWeightsOutputLayer();
    ComputeErrorsForHiddenLayer();
    UpdateWeightsHiddenLayer();
}

2nd Approach:

void BackPropagate() {
    ComputeErrorsForOutputLayer();
    ComputeErrorsForHiddenLayer();
    UpdateWeightsOutputLayer();
    UpdateWeightsHiddenLayer();
}

Thanks for everything.


Solution

  • I am pretty sure that you have misunderstood the concept here. The two genuine strategies are:

    • update the weights after all the errors for one input vector are calculated (online/stochastic learning)
    • update the weights after all the errors for all the input vectors are calculated (batch learning)

    which is completely different from what you have written. These are the per-sample and batch strategies; both have their pros and cons, and thanks to its simplicity the first one is much more common in implementations. A sketch of the difference follows this list.
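
    For illustration, here is a minimal sketch of those two genuine strategies, using a one-layer linear model with squared error so the gradient stays one line long; all names (gradient, train_online, train_batch) are illustrative, not taken from any particular library:

    import numpy as np

    # Sketch: per-sample (online) vs. batch weight updates.
    def gradient(w, x, y):
        # d/dw of 0.5 * (w.x - y)^2  =  (w.x - y) * x
        return (w @ x - y) * x

    def train_online(w, samples, lr=0.1, epochs=100):
        # Update the weights immediately after each input vector.
        for _ in range(epochs):
            for x, y in samples:
                w = w - lr * gradient(w, x, y)
        return w

    def train_batch(w, samples, lr=0.1, epochs=100):
        # Accumulate the gradient over ALL input vectors, then update once.
        for _ in range(epochs):
            g = sum(gradient(w, x, y) for x, y in samples) / len(samples)
            w = w - lr * g
        return w

    samples = [(np.array([1.0, 2.0]), 3.0), (np.array([2.0, 1.0]), 3.0)]
    print(train_online(np.zeros(2), samples))   # both converge toward w = [1, 1]
    print(train_batch(np.zeros(2), samples))

    Both variants converge on this toy problem; in general the online one is noisier but cheaper per step, which is part of why it is so common.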

    Regarding your "methods", the second one is the only correct one. "Propagating" the error is just a computational shortcut for computing the derivative of the error function, and the (basic) learning process is a steepest-descent method: every step updates all the weights at once, w ← w − η∇E(w). If you compute the derivative only for part of the dimensions (the output layer), take a step in that direction, and then recalculate the remaining derivatives with respect to the new weight values, you are no longer performing gradient descent; in particular, the hidden-layer errors are computed through the output-layer weights, so updating those weights first corrupts the hidden-layer derivatives. The only scenario where the first method is acceptable is when your weight updates do not interfere with your error computation; then the order does not matter, as the two steps are independent. A sketch of the correct ordering follows.
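
    To make the correct ordering concrete, here is a minimal sketch (again with illustrative names, a squared-error loss assumed) of one backpropagation step for a single-hidden-layer MLP with sigmoid units. The key point is that delta_hid is computed from the still-unchanged output weights W2 before either layer is updated:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backpropagate(x, y, W1, W2, lr=0.5):
        # Forward pass.
        h = sigmoid(W1 @ x)   # hidden activations
        o = sigmoid(W2 @ h)   # output activations

        # Compute ALL the errors first (your 2nd approach).
        delta_out = (o - y) * o * (1 - o)
        # The hidden delta is computed from the CURRENT W2, which is
        # exactly why W2 must not be updated before this line.
        delta_hid = (W2.T @ delta_out) * h * (1 - h)

        # Only now take the steepest-descent step for both layers.
        W2 = W2 - lr * np.outer(delta_out, h)
        W1 = W1 - lr * np.outer(delta_hid, x)
        return W1, W2

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(3, 2))   # 2 inputs -> 3 hidden units
    W2 = rng.normal(size=(1, 3))   # 3 hidden -> 1 output
    x, y = np.array([0.5, -0.2]), np.array([1.0])
    W1, W2 = backpropagate(x, y, W1, W2)

    Moving the two update lines above the computation of delta_hid would reproduce your 1st approach and silently change the gradient.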