
Use of loss functions in updating weights of neural networks


I am confused about the line tf.reduce_mean(tf.nn.l2_loss(prediction - output)).

While doing back propagation, the output should be a vector as each output neuron's predicted output is subtracted from the actual output and this is repeated for all output neurons, hence we get a vector of size (n,1). If we use tf.reduce_mean(tf.nn.l2_loss(prediction - output)), the output is a single value. I am unable to understand how this single value will be propagated to update the weights. Shouldn't it be always a vector?
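For reference, here is a small NumPy sketch (my own toy values, not from the question) of what tf.nn.l2_loss computes: half the sum of squared differences, collapsed into a single scalar.

```python
import numpy as np

# hypothetical toy values standing in for the network's tensors
prediction = np.array([0.8, 0.1, 0.6])
output = np.array([1.0, 0.0, 1.0])

# tf.nn.l2_loss(t) computes sum(t ** 2) / 2 -- a single scalar,
# even though (prediction - output) is a vector of per-neuron errors
diff = prediction - output
l2 = np.sum(diff ** 2) / 2.0  # -> 0.105 for these values
```

So the vector of per-neuron errors does exist, it is just summed into one number before the optimizer sees it.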


Solution

  • You are correct that each output neuron produces its own error term. However, this line

    tf.reduce_mean
    

    averages the loss over the batch, not over the output vector. The result is a single scalar, but backpropagation differentiates that scalar with respect to every weight via the chain rule, so each weight still receives its own gradient component. See https://stats.stackexchange.com/questions/201452/is-it-common-practice-to-minimize-the-mean-loss-over-the-batches-instead-of-the
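To see why a scalar loss is enough, here is a minimal NumPy sketch (a toy linear model of my own, not the asker's network) of one gradient-descent step: the loss is one number, yet its gradient has one component per weight.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))  # batch of 4 examples, 3 features
y = rng.normal(size=4)       # targets
w = np.zeros(3)              # weights of a linear model

pred = X @ w
# scalar loss, analogous to tf.nn.l2_loss(pred - y)
loss = 0.5 * np.sum((pred - y) ** 2)

# the gradient of that *scalar* with respect to w is still a vector,
# one component per weight: dL/dw = X^T (pred - y)
grad = X.T @ (pred - y)

w = w - 0.1 * grad  # every weight gets its own update
```

TensorFlow's automatic differentiation does the same thing for arbitrary networks: the forward pass reduces everything to one scalar, and the backward pass fans that scalar back out into a gradient for each parameter.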