In backpropagation for a neural network with a sigmoid activation function, the weight update rule is given by:

NewWeight = OldWeight - alpha * D * A

where alpha is the learning rate, A is the vector of activations from the previous layer, and

D = (Y - Y') Y' (1 - Y')

is the error term (delta), where Y is the given (target) value and Y' is the output computed by the network's output layer.
Y in my case is 4x1 = [0.3, 0.2, 0.4, 0.1], and an instance of Y' is 4x1 = [0.2, 0.1, 0.1, 0.2].
How do I compute D = (Y - Y')Y'(1-Y')
(Y - Y') is 4x1 and Y' is 4x1, and since 1 ≠ 4, matrix multiplication is not possible. Also, (1 - Y') is 4x1. How can I multiply (Y - Y'), Y', and (1 - Y') to obtain D? If I have to take a transpose, which matrix should I transpose so that the net effect is unchanged?
Or is it an elementwise multiplication?
It is indeed elementwise multiplication. You need to multiply each output error (Y - Y') by the derivative of the sigmoid with respect to its input, which is Y'(1 - Y') for that output unit. Think of it as a "corrected error signal". D * A, on the other hand, is a vector outer product, which gives you a matrix with the same shape as the weight matrix.
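A minimal NumPy sketch using the vectors from the question; the previous-layer activation vector A (3 units here) and the old weights W are hypothetical placeholders, and the update follows the rule exactly as written in the question:

```python
import numpy as np

Y = np.array([0.3, 0.2, 0.4, 0.1])      # target output, 4x1
Y_hat = np.array([0.2, 0.1, 0.1, 0.2])  # computed output Y', 4x1

# Elementwise product: D = (Y - Y') * Y' * (1 - Y'), still a 4x1 vector
D = (Y - Y_hat) * Y_hat * (1 - Y_hat)

# A: activations from the previous layer (assumed 3 units, made-up values)
A = np.array([0.5, 0.9, 0.4])

# D * A as an outer product: a 4x3 matrix, same shape as the weights
grad = np.outer(D, A)

alpha = 0.1                    # learning rate
W = np.zeros((4, 3))           # placeholder old weight matrix
W_new = W - alpha * grad       # the update rule from the question

print(D)           # 4-element delta vector
print(grad.shape)  # (4, 3)
```

The `*` operator in NumPy performs elementwise multiplication on same-shaped arrays, so no transposes are needed for D; the shape mismatch only matters once you form D * A, which `np.outer` handles.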