Tags: python, numpy, machine-learning, neural-network, backpropagation

How to calculate the gradient of the cost with respect to the weights using a dot product?


I'm trying to reproduce a neural network from http://neuralnetworksanddeeplearning.com/chap2.html

What I don't get is why they can calculate the gradient of the cost with respect to the weights by taking the dot product of the error/delta and the transposed activations of the previous layer.

nabla_w[-1] = np.dot(delta, activations[-2].transpose())

delta is a 1-dimensional array, and so is activations[-2]. I thought that if you transpose a 1-dimensional array you just get the same 1-dimensional array back, so this dot product should give only a single number and not the matrix we want.

So how can this dot product give me a 2-dimensional matrix?

And is there a smart way to achieve this (calculating the weight gradients) with numpy?
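A quick check of what I mean, with made-up numbers:

    import numpy as np

    delta = np.array([0.1, 0.2, 0.3])        # 1-dimensional, shape (3,)
    activation = np.array([1.0, 2.0, 3.0])   # 1-dimensional, shape (3,)

    print(activation.transpose().shape)           # (3,) -- transposing a 1-D array changes nothing
    print(np.dot(delta, activation.transpose()))  # 1.4 -- a single scalar, not a matrix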


Solution

  • Calculating the dot product between two vectors, i.e. your one-dimensional arrays, returns a single scalar value. Performing a cross product between two vectors produces a new vector.

    Therefore, it can't result in a matrix: the dot product doesn't produce matrices, only a scalar. np.dot() with two matrices (2-D arrays) as parameters returns the matrix multiplication of the two, but that is not the same as the dot product. That is what happens in the book's code: delta and the activations are stored as (n, 1) column vectors, i.e. 2-D arrays, so np.dot(delta, activations[-2].transpose()) multiplies an (n, 1) array by a (1, m) array and gives the (n, m) weight-gradient matrix (an outer product).
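    Here is a minimal sketch of both behaviours, with made-up layer sizes and values; np.outer is mentioned only as one convenient alternative if you really do start from 1-D arrays:

        import numpy as np

        # Made-up example: 3 neurons in the current layer, 4 in the previous one.
        delta = np.array([[0.1], [0.2], [0.3]])                   # shape (3, 1), a column vector
        prev_activation = np.array([[1.0], [2.0], [3.0], [4.0]])  # shape (4, 1), a column vector

        # With 2-D column vectors np.dot is a matrix multiplication:
        # (3, 1) times (1, 4) gives the (3, 4) weight-gradient matrix (an outer product).
        nabla_w = np.dot(delta, prev_activation.transpose())
        print(nabla_w.shape)                      # (3, 4)

        # With genuinely 1-D arrays, np.outer builds the same matrix directly:
        nabla_w_alt = np.outer(delta.ravel(), prev_activation.ravel())
        print(np.allclose(nabla_w, nabla_w_alt))  # True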