apache-spark neural-network apache-spark-mllib backpropagation feed-forward

Why does Spark's MultilayerPerceptron use outputs(i+1) when computing the previous delta?


Looking at this code:

for (i <- (L - 2) to (0, -1)) {
    layerModels(i + 1).computePrevDelta(deltas(i + 1), outputs(i + 1), deltas(i))
}

I want to understand why we pass outputs(i+1) instead of outputs(i) in the snippet above. As far as I understand, the output is only needed for a sigmoid activation layer, whose derivative is f'(x) = f(x) * (1 - f(x)) = outputs(i) * (1 - outputs(i)).

That would mean we should be using outputs(i) to find prevDelta.


Solution

  • I figured out why. I'm answering here in case someone else stumbles on the same question.

    Note that we are computing the delta for layer i, which depends only on the delta and gradient of the next, (i+1)-th, layer. That is why the call is made on layerModels(i + 1) and not layerModels(i): the layer being differentiated is layer i+1, so the f(x) in f'(x) = f(x) * (1 - f(x)) is that layer's own output, outputs(i + 1).
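
To make the indexing concrete, here is a minimal toy sketch of the same backward loop. It is not Spark's code. ToyLayer, ScaleLayer, SigmoidLayer and BackpropSketch are hypothetical names invented for this illustration, plain arrays stand in for matrices, computePrevDelta returns its result instead of writing into a preallocated buffer, and the loss is assumed to be the sum of the final outputs, so the top delta is all ones.

```scala
// Toy sketch, not Spark's implementation: every name here is hypothetical.
trait ToyLayer {
  def eval(input: Array[Double]): Array[Double]
  // delta:  gradient of the loss w.r.t. THIS layer's output
  // output: THIS layer's own output from the forward pass
  // returns the gradient w.r.t. this layer's input (the previous layer's output)
  def computePrevDelta(delta: Array[Double], output: Array[Double]): Array[Double]
}

final case class ScaleLayer(w: Double) extends ToyLayer {
  def eval(input: Array[Double]): Array[Double] = input.map(_ * w)
  // A linear layer does not need its output to backpropagate.
  def computePrevDelta(delta: Array[Double], output: Array[Double]): Array[Double] =
    delta.map(_ * w)
}

case object SigmoidLayer extends ToyLayer {
  def eval(input: Array[Double]): Array[Double] =
    input.map(x => 1.0 / (1.0 + math.exp(-x)))
  // f'(x) = f(x) * (1 - f(x)), written in terms of this layer's own output y = f(x)
  def computePrevDelta(delta: Array[Double], output: Array[Double]): Array[Double] =
    delta.zip(output).map { case (d, y) => d * y * (1.0 - y) }
}

object BackpropSketch {
  val layers: Array[ToyLayer] = Array(ScaleLayer(2.0), SigmoidLayer, ScaleLayer(3.0))

  // outputs(i) is the output of layers(i)
  def forward(input: Array[Double]): Array[Array[Double]] =
    layers.scanLeft(input)((in, layer) => layer.eval(in)).tail

  // deltas(i) is the gradient of the loss w.r.t. outputs(i)
  def backward(outputs: Array[Array[Double]]): Array[Array[Double]] = {
    val last = layers.length - 1
    val deltas = new Array[Array[Double]](layers.length)
    // Assumed loss: sum of the final outputs, so the top delta is all ones.
    deltas(last) = Array.fill(outputs(last).length)(1.0)
    for (i <- (last - 1) to 0 by -1) {
      // Going from deltas(i + 1) to deltas(i) crosses layer i + 1,
      // so that layer is handed its own output, outputs(i + 1).
      deltas(i) = layers(i + 1).computePrevDelta(deltas(i + 1), outputs(i + 1))
    }
    deltas
  }

  def main(args: Array[String]): Unit = {
    val outputs = forward(Array(0.5, -1.0))
    val deltas = backward(outputs)
    println(deltas(0).mkString(", "))
  }
}
```

In this sketch deltas(0) is the gradient with respect to outputs(0), which is the input of the sigmoid layer at index 1. Getting there means multiplying by the sigmoid's derivative, and that derivative is expressed through the sigmoid layer's own output, outputs(1), not through outputs(0).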