I'm reading an article about machine learning theory, and it has the following step for calculating a partial derivative:
∂(w5 * h1 + w6 * h2 + b3) / ∂h1 = w5 * f′(w5 * h1 + w6 * h2 + b3)
As I understand how partial derivatives are calculated, the result of ∂(w5 * h1 + w6 * h2 + b3) / ∂h1 should be w5, not w5 * f′(w5 * h1 + w6 * h2 + b3). I'm very confused by this step. Could you explain it? Thank you.
I believe there is a typo in the formula that you are reading.
∂f(w5 * h1 + w6 * h2 + b3) / ∂h1 = w5 * f′(w5 * h1 + w6 * h2 + b3)
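I believe there should be an f on the LHS of the equation, in which case the chain rule applies: the derivative of the outer function f, evaluated at w5 * h1 + w6 * h2 + b3, is multiplied by the derivative of the inner expression with respect to h1, which is w5.
If f is absent, then you are right, and the result is simply w5.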
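Both cases can be checked numerically with a finite-difference approximation. The sketch below is only illustrative: the article does not say what f is, so f is assumed here to be the sigmoid, and the weights, bias, and inputs are made-up values. The helper names (`pre_activation`, `output`, `numeric_partial_h1`) are hypothetical.

```python
import math


def sigmoid(z):
    # Assumed activation; the original article does not specify f.
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    # Derivative of the sigmoid: f'(z) = f(z) * (1 - f(z)).
    s = sigmoid(z)
    return s * (1.0 - s)


def pre_activation(h1, h2, w5, w6, b3):
    # The expression without f: w5 * h1 + w6 * h2 + b3.
    return w5 * h1 + w6 * h2 + b3


def output(h1, h2, w5, w6, b3):
    # The expression with f applied: f(w5 * h1 + w6 * h2 + b3).
    return sigmoid(pre_activation(h1, h2, w5, w6, b3))


def numeric_partial_h1(func, h1, h2, w5, w6, b3, eps=1e-6):
    # Central-difference estimate of the partial derivative w.r.t. h1.
    upper = func(h1 + eps, h2, w5, w6, b3)
    lower = func(h1 - eps, h2, w5, w6, b3)
    return (upper - lower) / (2.0 * eps)


# Made-up values, for illustration only.
w5, w6, b3 = 0.4, -0.7, 0.1
h1, h2 = 0.5, 0.3

# Without f, the partial derivative is just w5.
linear_numeric = numeric_partial_h1(pre_activation, h1, h2, w5, w6, b3)

# With f, the chain rule gives w5 * f'(w5 * h1 + w6 * h2 + b3).
activated_numeric = numeric_partial_h1(output, h1, h2, w5, w6, b3)
activated_analytic = w5 * sigmoid_prime(pre_activation(h1, h2, w5, w6, b3))

print("without f:", linear_numeric, "expected:", w5)
print("with f:   ", activated_numeric, "expected:", activated_analytic)
```

The first printed pair should agree up to floating-point error, which supports the reading that the derivative is w5 when f is absent. The second pair should also agree, which supports the corrected formula with f on the LHS.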