artificial-intelligence, neural-network, backpropagation

Calculate the error using a sigmoid function in backpropagation


I have a quick question regarding backpropagation. I am looking at the following:

http://www4.rgu.ac.uk/files/chapter3%20-%20bp.pdf

In this paper, it says to calculate the error of the neuron as

Error = Output(i) * (1 - Output(i)) * (Target(i) - Output(i))

The part of the equation that I don't understand is the Output(i) * (1 - Output(i)) term. The paper says this term is needed because of the sigmoid function, but I still don't understand why it would be necessary.

What would be wrong with using

Error = abs(Output(i) - Target(i))

?

Is the error calculation the same regardless of the neuron's activation/transfer function?


Solution

  • The reason you need this is that you are calculating the derivative of the error function with respect to the neuron's inputs.

    When you take that derivative via the chain rule, you have to multiply by the derivative of the neuron's activation function (which here is a sigmoid). That is also what rules out Error = abs(Output(i) - Target(i)): it tells you how wrong the neuron is, but it is not the gradient that backpropagation propagates backwards, and it ignores the activation function entirely.

    Here's the important math.

    Calculate the derivative of the error on the neuron's inputs via the chain rule:

    E = -(1/2) * (target - output)^2      (the 1/2 is there so it cancels the 2 from the power rule)
    
    dE/dinput = dE/doutput * doutput/dinput
    

    Work out doutput/dinput:

    output = sigmoid(input)
    
    doutput/dinput = output * (1 - output)    (derivative of sigmoid function)
    

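    If you want to convince yourself of that identity, here is a quick numerical check (a minimal Python sketch of my own, not something from the paper):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    x = 0.7                              # arbitrary test input
    out = sigmoid(x)

    analytic = out * (1 - out)           # the identity above

    h = 1e-6                             # finite-difference step size
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

    print(analytic, numeric)             # both print ~0.2217
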
    Work out dE/doutput:

    dE/doutput = target - output
    

    Substituting both factors back into the chain rule:

    dE/dinput = (target - output) * output * (1 - output)
    

    which is exactly the Error formula from the paper.
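
    The same finite-difference trick confirms the whole expression. Again a hedged Python sketch; inp, target, and the other names below are mine, not the paper's:

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def E(inp, target):
        # E = -(1/2) * (target - output)^2, as defined above
        return -0.5 * (target - sigmoid(inp)) ** 2

    inp, target = 0.7, 1.0

    out = sigmoid(inp)
    delta = (target - out) * out * (1 - out)   # the paper's Error term

    h = 1e-6
    numeric = (E(inp + h, target) - E(inp - h, target)) / (2 * h)

    print(delta, numeric)                      # both print ~0.0736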