Tags: python, python-3.x, math, neural-network, backpropagation

Working with the backpropagation algorithm using the softmax function in a neural network


I am creating a neural network from scratch for the MNIST data, so I have 10 classes in the output layer. To perform backpropagation, I need to calculate dA*dZ for the last layer, where dA is the derivative of the loss function L with respect to the softmax activation A, and dZ is the derivative of the softmax activation A with respect to z, where z = wx + b. The size I obtain for dA is 10×1, whereas the size I obtain for dZ is 10×10.

Is this correct? If so, how do I multiply dA*dZ, given that they have different dimensions?


Solution

  • You are almost there. However, you need to transpose dA first, e.g. with numpy.transpose(dA) or dA.T. The transposed dA has shape 1×10, which can be matrix-multiplied with the 10×10 Jacobian dZ to produce the 1×10 gradient of the loss with respect to z.
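
    Below is a minimal NumPy sketch of this computation. It assumes a cross-entropy loss with a one-hot target (the question does not state which loss is used), so dA = -y/A; the variable names z, a, y, dA, and dZ are illustrative, not taken from the question's code.

    ```python
    import numpy as np

    def softmax(z):
        # Numerically stable softmax for a column vector.
        e = np.exp(z - np.max(z))
        return e / e.sum()

    def softmax_jacobian(a):
        # Jacobian dA/dZ of softmax: J[i, j] = a_i * (delta_ij - a_j),
        # i.e. diag(a) - a @ a.T, shape (10, 10) for a (10, 1) input.
        return np.diagflat(a) - a @ a.T

    # Toy setup with hypothetical values.
    z = np.random.randn(10, 1)          # z = wx + b for one sample
    a = softmax(z)                      # A, shape (10, 1)
    y = np.zeros((10, 1)); y[3] = 1.0   # one-hot label (class 3 chosen arbitrarily)

    dA = -y / a                         # dL/dA for cross-entropy, shape (10, 1)
    dZ = softmax_jacobian(a)            # dA/dZ, shape (10, 10)

    # Transpose dA as suggested: (1, 10) @ (10, 10) -> (1, 10)
    dL_dZ = dA.T @ dZ

    # Sanity check: for softmax + cross-entropy this simplifies to (a - y).T
    assert np.allclose(dL_dZ, (a - y).T)
    ```

    As the sanity check illustrates, if the loss is cross-entropy you can skip building the 10×10 Jacobian entirely and use the well-known shortcut dL/dZ = A - y, which is both faster and less error-prone.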