I am sure this has a simple answer! I am asking to improve my understanding.
Diagram: a modified version of the CS231n backpropagation example.
If the Chain Rule is applied to get the delta for y, the gradient will be dy = -4, according to the diagram.
Applying Chain Rule notation: df/dy = df/dq * dq/dy
Numerically:
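// Forward-pass values read from the diagram
double x = -2;
double y = 5;
double q = 3;
double z = -4;
double f = -12;
// Gradients read from the diagram
double df = 1;   // df/df
double dz = 3;   // df/dz
double dq = -4;  // df/dq
// My attempt at the chain rule for the two inputs
double dy = df * dq;
double dx = df * dq;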
Where: df = df/df = 1 as shown above, and dq = df/dq = -4 as shown above. Thus: 1 (df) * -4 (dq) = -4 (dy). Or have I got this completely wrong?
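For reference, the numbers above can be reproduced with a short sketch. It assumes the diagram is the circuit f = (x + y) * z, which is consistent with q = 3 and f = -12 but is not stated explicitly here. The names dqdx and dqdy are introduced only for illustration, and Python is used because the CS231n notes use it.

```python
# Sketch only: assumes the circuit is f = (x + y) * z (an assumption,
# chosen because it matches q = 3 and f = -12 in the diagram).
x, y, z = -2.0, 5.0, -4.0

# Forward pass
q = x + y          # 3.0
f = q * z          # -12.0

# Backward pass, from the output towards the inputs
df = 1.0           # df/df
dz = q * df        # df/dz = q
dq = z * df        # df/dq = z
dqdx = 1.0         # local gradient of q = x + y with respect to x (illustrative name)
dqdy = 1.0         # local gradient of q = x + y with respect to y (illustrative name)
dx = dq * dqdx     # df/dx = df/dq * dq/dx
dy = dq * dqdy     # df/dy = df/dq * dq/dy

print(dx, dy, dz)  # -4.0 -4.0 3.0
```

Under that assumption, dy comes out equal to dq only because the local gradient dq/dy of an add gate is 1.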
Where do the numerical values actually come from, and where are they in the diagram? Is this a gradient-only numerical chain, or are we deriving them from the other input values? The reason I ask is that on page 48 there is a slightly confusing code example:
I am reading the (/) sign in df/dy as a division, and I think this is wrong? df/dy = df/dq * dq/dy = 1/-4 * -4/-4 = -0.25. What is the purpose of one number over the other here?
Is it that df/dy = dy, that they are the same thing, symbolising the dy of df, meaning one gradient flowing back in time?
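One way to test the two readings of df/dy is to compare them against a finite-difference estimate, where df/dy is the change in f per small change in y. The sketch below again assumes the circuit is f = (x + y) * z, which is not stated explicitly here; the helper f and the step size h are hypothetical choices made for illustration.

```python
# Sketch only: assumes f = (x + y) * z; the function f and step h are illustrative.
def f(x, y, z):
    return (x + y) * z

x, y, z = -2.0, 5.0, -4.0
h = 1e-5

# Reading 1: df/dy is a rate of change, estimated by nudging y slightly
dfdy_numeric = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h)

# Reading 2: chain rule, df/dy = df/dq * dq/dy with df/dq = z and dq/dy = 1
dfdy_chain = z * 1.0

# Reading 3: treating the "/" as a division of the gradient values themselves
ratio_of_values = (1.0 / -4.0) * (-4.0 / -4.0)

print(dfdy_numeric, dfdy_chain, ratio_of_values)
```

If the assumption holds, the first two values agree at about -4 and the third does not, which suggests that df/dy names a single quantity (the value stored in dy) rather than one gradient value divided by another.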
Apologies, I am somewhat confused.
A refresher on Differential Equations helped clear up the confusion: https://www.khanacademy.org/math/differential-equations/first-order-differential-equations/differential-equations-intro/v/differential-equation-introduction
Confusion is the greatest problem for learning!