In an ANN, the equation during forward propagation is Y = W.X + b.
What is the equation during forward propagation for an RNN, since it involves states and timesteps?
What is the difference between an ANN and an RNN in terms of backpropagation?
Also, what is the difference in functionality between Dropout in an ANN and recurrent_dropout in an RNN?
Are there any other key differences between an ANN and an RNN?
The equation for the forward propagation of an RNN, considering two timesteps, in a simple form, is shown below:

Output of the first time step: Y0 = (Wx * X0) + b
Output of the second time step: Y1 = (Wx * X1) + (Wy * Y0) + b
where Y0 = (Wx * X0) + b
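To make those two steps concrete, here is a minimal Python sketch; the weight, bias, and input values are made up, and no activation function is applied because the simplified equations above omit one:

```python
# Hypothetical scalar weights, bias, and inputs for illustration only.
Wx, Wy, b = 0.5, 0.3, 0.1
X0, X1 = 1.0, 2.0

Y0 = Wx * X0 + b            # first time step: 0.5*1.0 + 0.1 = 0.6
Y1 = Wx * X1 + Wy * Y0 + b  # second step also sees Y0 through Wy
print(Y0, Y1)               # prints roughly 0.6 and 1.28
```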
To elaborate, consider an RNN with 5 neurons/units; the more detailed equation is shown in the screenshot below:
Equation of Forward Propagation of RNN
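As a rough, runnable version of that 5-unit case (assuming the usual tanh activation of a simple RNN cell, a hypothetical input size of 3, and hypothetical random weights):

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_units = 3, 5                   # 5 neurons/units; input size is an assumption

Wx = rng.normal(size=(n_inputs, n_units))  # input-to-hidden weights
Wy = rng.normal(size=(n_units, n_units))   # hidden-to-hidden (recurrent) weights
b = np.zeros(n_units)

X0 = rng.normal(size=(1, n_inputs))        # input at t=0 (batch of 1)
X1 = rng.normal(size=(1, n_inputs))        # input at t=1

Y0 = np.tanh(X0 @ Wx + b)                  # shape (1, 5)
Y1 = np.tanh(X1 @ Wx + Y0 @ Wy + b)        # shape (1, 5); Y0 feeds back in
print(Y0.shape, Y1.shape)
```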
Back Propagation in RNN:
To train an RNN, it is unrolled through time and regular backpropagation is applied; this is known as backpropagation through time (BPTT). The output sequence is evaluated using a cost function C(y(tmin), y(tmin+1), ..., y(tmax)) (where tmin and tmax are the first and last output time steps, not counting the ignored outputs), and the gradients of that cost function are propagated backward through the unrolled network.

In the screenshot below, dashed lines represent forward propagation and solid lines represent back propagation.
Flow of Forward Propagation and Back Propagation in RNN
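A minimal TensorFlow sketch of that flow (the shapes, weights, and squared-error cost are assumptions for illustration): unroll two time steps under a GradientTape, evaluate a cost over both outputs, and propagate its gradients backward. Note that the gradient with respect to Wx accumulates contributions from both time steps, because Wx is reused at every step of the unrolled network.

```python
import tensorflow as tf

Wx = tf.Variable(tf.random.normal((3, 5)))  # input-to-hidden weights
Wy = tf.Variable(tf.random.normal((5, 5)))  # recurrent weights
b = tf.Variable(tf.zeros((5,)))

X0 = tf.random.normal((1, 3))               # hypothetical inputs at t=0 and t=1
X1 = tf.random.normal((1, 3))

with tf.GradientTape() as tape:
    Y0 = tf.tanh(X0 @ Wx + b)               # forward pass, unrolled...
    Y1 = tf.tanh(X1 @ Wx + Y0 @ Wy + b)     # ...through two time steps
    cost = tf.reduce_mean(Y0**2) + tf.reduce_mean(Y1**2)  # C(y(0), y(1))

# Gradients flow back through the whole unrolled graph (BPTT).
grads = tape.gradient(cost, [Wx, Wy, b])
```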
Dropout: If we set Dropout to 0.1 on a recurrent layer (e.g. an LSTM), 10% of the layer's input units are randomly dropped at each training step, i.e. only 90% of the inputs are passed to the recurrent layer.
Recurrent Dropout: If we set Recurrent Dropout to 0.2 on a recurrent layer (e.g. an LSTM), it drops 20% of the recurrent connections, i.e. the hidden-state units carried from one time step to the next, so each step sees only 80% of the recurrent state.
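In Keras both are just arguments on the same layer; a minimal sketch, with hypothetical layer sizes and input shape:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(None, 3)),  # (time steps, features) -- hypothetical
    keras.layers.LSTM(
        5,
        dropout=0.1,               # drops 10% of the input units
        recurrent_dropout=0.2,     # drops 20% of the recurrent state units
    ),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```

Both masks are applied only during training (e.g. model.fit), not at inference time.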
Hope this answers all your queries!