I'm trying to reproduce a neural network from http://neuralnetworksanddeeplearning.com/chap2.html
What I don't get is why they can compute the gradient for the weights by taking the dot product of the error/delta and the transposed activations of the previous layer:
nabla_w[-1] = np.dot(delta, activations[-2].transpose())
delta is a 1-dimensional array, and so is activations[-2]. I thought that if you transpose a 1-dimensional array you just get the same 1-dimensional array back.
So this dot product should give only a single number, not the matrix we want.
So how can this dot product give me a 2-dimensional matrix?
And is there a smart way to do this (compute the gradients for the weights) with numpy?
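Here is a minimal sketch of what I mean (the values are made up): with 1-dimensional arrays, transpose does nothing and np.dot just gives a scalar:

    import numpy as np

    delta = np.array([1.0, 2.0, 3.0])   # 1-D array, shape (3,)
    print(delta.transpose().shape)      # still (3,) -- transpose is a no-op on 1-D arrays
    print(np.dot(delta, delta))         # dot of two 1-D arrays -> a single scalar: 14.0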
Calculating the dot product between two vectors, i.e. your one-dimensional arrays, returns a single scalar value. (A cross product between two vectors would produce a new vector, but that is not what is happening here.) So a true dot product can't result in a matrix.
The catch is that np.dot() is not limited to the mathematical dot product: given two 2-D arrays, it performs matrix multiplication, which is not the same thing. And in the book's code, delta and activations[-2] are not 1-dimensional arrays at all; they are column vectors, i.e. 2-D arrays of shape (n, 1). Transposing the (m, 1) column activations[-2] gives a (1, m) row, and np.dot then multiplies an (n, 1) matrix by a (1, m) matrix, producing the (n, m) outer product, exactly the shape of the weight matrix.
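A minimal sketch of how the shapes work out (the array sizes and values here are made up for illustration), plus np.outer as an alternative if you really do keep 1-dimensional arrays:

    import numpy as np

    # In the book's code, delta and activations[-2] are column vectors:
    # 2-D arrays of shape (n, 1). The values below are made up.
    delta = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1)
    a_prev = np.array([[0.5], [0.25]])        # shape (2, 1)

    # transpose() turns the (2, 1) column into a (1, 2) row, so np.dot
    # performs matrix multiplication: (3, 1) x (1, 2) -> (3, 2).
    nabla_w = np.dot(delta, a_prev.transpose())
    print(nabla_w.shape)                      # (3, 2)

    # With genuine 1-D arrays, np.outer gives the same outer product:
    print(np.outer(delta.ravel(), a_prev.ravel()))   # also a (3, 2) matrix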