Tags: machine-learning, neural-network, backpropagation

Calculating derivatives with backpropagation using Sutskever's technique


In "TRAINING RECURRENT NEURAL NETWORK" by Ilya Sutskever, there's the following technique for calculating derivatives with backpropagation in feed-forward neural networks.

The network has l hidden layers, l+1 weight matrices and l+1 bias vectors.

"Forward" stage:

[image: forward-pass pseudocode from the thesis]

"Backwards" stage:

[image: backward-pass pseudocode from the thesis]
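The equation images did not survive extraction. For orientation only, here is my own reconstruction of the standard feed-forward recursions with 0-indexed z and 1-indexed x (the indexing the solution below refers to); it is not a verbatim copy of the thesis, and the activation e and the return index may differ from Sutskever's pseudocode:

Forward pass, with $z_0$ the input:

$$x_i = W_i z_{i-1} + b_i, \qquad z_i = e(x_i), \qquad i = 1, \dots, l+1$$

Backward pass, given $\partial L / \partial z_{l+1}$, for $i = l+1$ down to $1$:

$$\frac{\partial L}{\partial x_i} = \frac{\partial L}{\partial z_i} \odot e'(x_i), \qquad \frac{\partial L}{\partial W_i} = \frac{\partial L}{\partial x_i}\, z_{i-1}^\top, \qquad \frac{\partial L}{\partial b_i} = \frac{\partial L}{\partial x_i}, \qquad \frac{\partial L}{\partial z_{i-1}} = W_i^\top \frac{\partial L}{\partial x_i}$$

With this convention z has l+2 entries (indices 0 through l+1) while x, W and b have l+1 entries (indices 1 through l+1).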

Isn't there an index problem with l+1? For example, in the forward stage we compute z_{l+1} but return z_l.

(Since this is such a major paper, I assume I'm missing something.)


Solution

  • There is no problem: some of the indices start at 0 (the variable z, for instance) and some start at 1 (the variable x). Follow the algorithm as laid out more carefully; try writing it out by hand explicitly for, say, l = 4.
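A quick way to see that the counts work out is to code the two passes with exactly this indexing and run the suggested l = 4 case. The sketch below is my own minimal NumPy rendering, not the thesis's exact pseudocode (tanh as the activation and the z/x naming are assumptions): z is 0-indexed, while x, W and b are 1-indexed via a dummy entry at position 0, so the loop runs i = 1, ..., l+1.

```python
import numpy as np

def forward(W, b, z0):
    """Forward pass. W and b are 1-indexed (W[0] = b[0] = None), so the real
    parameters are W[1..l+1], b[1..l+1]; z is 0-indexed with z[0] the input;
    x is 1-indexed (x[0] = None)."""
    x, z = [None], [z0]
    for i in range(1, len(W)):            # i = 1, ..., l+1
        x.append(W[i] @ z[i - 1] + b[i])  # x_i = W_i z_{i-1} + b_i
        z.append(np.tanh(x[i]))           # z_i = e(x_i), here e = tanh
    return x, z

def backward(W, x, z, dz):
    """Backward pass: dz = dL/dz_{l+1}. Returns dW, db, 1-indexed like W, b."""
    dW, db = [None] * len(W), [None] * len(W)
    for i in range(len(W) - 1, 0, -1):    # i = l+1, ..., 1
        dx = dz * (1.0 - z[i] ** 2)       # tanh'(x_i) = 1 - tanh(x_i)^2 = 1 - z_i^2
        dW[i] = np.outer(dx, z[i - 1])    # dL/dW_i = dx_i z_{i-1}^T
        db[i] = dx                        # dL/db_i = dx_i
        dz = W[i].T @ dx                  # dL/dz_{i-1} = W_i^T dx_i
    return dW, db

# Write it out for l = 4, as suggested: 4 hidden layers means
# l+1 = 5 weight matrices and l+2 = 6 activation vectors z_0, ..., z_5.
l = 4
sizes = [3, 5, 5, 5, 5, 2]  # input width, 4 hidden widths, output width
rng = np.random.default_rng(0)
W = [None] + [rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(l + 1)]
b = [None] + [rng.standard_normal(sizes[i + 1]) for i in range(l + 1)]
z0 = rng.standard_normal(sizes[0])
x, z = forward(W, b, z0)
print(len(W) - 1, len(z))  # 5 weight matrices, 6 activations z_0, ..., z_5
```

Writing it out like this makes the bookkeeping concrete: z really does have l+2 entries (indices 0 through l+1) while x, W and b have l+1 usable entries (indices 1 through l+1), which is exactly the offset between 0-based and 1-based variables that resolves the apparent l+1 mismatch.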