When computing the delta values for a neural network after running backpropagation: will the value of delta(1) be a scalar, or should it be a vector?
Update:
Taken from http://www.holehouse.org/mlclass/09_Neural_Networks_Learning.html
First, you probably understand that in each layer we have n x m
parameters (or weights) that need to be learned, so they form a 2-D matrix.
n is the number of nodes in the current layer;
m is the number of nodes in the previous layer plus 1 (for the bias unit).
We have n x m
parameters because there is one connection between every node in the previous layer (including its bias unit) and every node in the current layer.
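For concreteness, here is a minimal numpy sketch of those dimensions; the layer sizes (4 nodes in the previous layer, 3 in the current one) are made up purely for illustration:

```python
import numpy as np

# Hypothetical layer sizes for illustration only.
prev_nodes = 4   # nodes in the previous layer
curr_nodes = 3   # nodes in the current layer

# One weight per (current node, previous node) pair, plus one extra
# column for the previous layer's bias unit: an n x m = 3 x 5 matrix.
Theta = np.random.randn(curr_nodes, prev_nodes + 1)
print(Theta.shape)  # (3, 5)
```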
I am pretty sure that Delta (big delta) at layer L is used to accumulate the partial derivative terms for every parameter at layer L, so you have a 2-D matrix of Delta at each layer as well. To update the entry in the i-th row (the i-th node in the current layer) and j-th column (the j-th node in the previous layer) of the matrix:
Delta_(i,j) := Delta_(i,j) + a_j * delta_i
Note that a_j is the activation of the j-th node in the previous layer and
delta_i is the error of the i-th node in the current layer,
so each parameter accumulates the downstream error scaled by the upstream activation that flowed through that connection.
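Here is a minimal sketch of that accumulation step for a single training example, assuming `a_prev` (activations of the previous layer, with the bias activation 1 prepended) and `delta_curr` (errors of the current layer) have already been computed; the numbers are placeholders:

```python
import numpy as np

# Placeholder values for illustration only.
a_prev = np.array([1.0, 0.5, -0.2, 0.8, 0.1])  # length m, a_0 = 1 is the bias
delta_curr = np.array([0.3, -0.1, 0.05])        # length n

# n x m accumulator, same shape as the weight matrix Theta.
Delta = np.zeros((delta_curr.size, a_prev.size))

# Element-wise this is Delta[i, j] += a_prev[j] * delta_curr[i],
# i.e. the outer product of delta_curr and a_prev.
Delta += np.outer(delta_curr, a_prev)
print(Delta.shape)  # (3, 5) -- a matrix, matching Theta's shape
```

In a full training loop you would repeat this accumulation over all training examples before dividing by the number of examples to get the averaged partial derivatives.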
Thus, to answer your question: Delta should be a matrix (one per layer), not a scalar or a vector.