machine-learning artificial-intelligence cross-entropy

How is cross-entropy calculated for pixel-level prediction?

I'm running an FCN in Keras that uses binary cross-entropy as the loss function. However, I'm not sure how the losses are accumulated.

I know that the loss gets applied at the pixel level, but then are the losses for each pixel in the image summed up to form a single loss per image? Or instead of being summed up, is it being averaged?

Furthermore, are the losses of the individual images simply summed over the batch, or is some other operation used?

Solution

I assume that your question is a general one and not specific to a particular model (if not, can you share your model?).

You are right that if the cross-entropy is used at a pixel level, the results have to be reduced (summed or averaged) over all pixels to get a single value.

Here is an example of a convolutional autoencoder in TensorFlow where this step is explicit:

https://github.com/udacity/deep-learning/blob/master/autoencoder/Convolutional_Autoencoder_Solution.ipynb

The relevant lines are:

loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=targets_, logits=logits)
cost = tf.reduce_mean(loss)
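
To make the reduction concrete, here is a minimal pure-Python sketch of what those two lines compute, assuming a made-up batch of two 2x2 single-channel images. The helper name and the data are illustrative and not taken from the notebook; the per-pixel formula is the numerically stable form documented for tf.nn.sigmoid_cross_entropy_with_logits. Called without an axis argument, tf.reduce_mean averages over every element, so pixels and batch are reduced in one step.

```python
import math

def sigmoid_cross_entropy_with_logits(label, logit):
    # Numerically stable form: max(x, 0) - x * z + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# Hypothetical batch: 2 images, each 2x2 pixels (logits and 0/1 targets).
logits = [
    [[2.0, -1.0], [0.5, -3.0]],
    [[-0.5, 1.5], [0.0, 4.0]],
]
targets = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [1.0, 1.0]],
]

# One loss value per pixel, grouped per image (flattened over rows).
per_image = [
    [
        sigmoid_cross_entropy_with_logits(t, x)
        for t_row, x_row in zip(t_img, x_img)
        for t, x in zip(t_row, x_row)
    ]
    for t_img, x_img in zip(targets, logits)
]

# Equivalent of reduce_mean with no axis: average over batch and pixels.
all_pixels = [v for image in per_image for v in image]
cost = sum(all_pixels) / len(all_pixels)

# Alternative view: average within each image, then across the batch.
image_means = [sum(image) / len(image) for image in per_image]
mean_of_image_means = sum(image_means) / len(image_means)

print(cost, mean_of_image_means)
```

Because every image here has the same number of pixels, averaging over all elements gives the same value as averaging within each image and then over the batch.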

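Whether you take the mean or the sum of the per-pixel losses does not change the minimizer, because the two differ only by a constant factor (the number of elements). That factor does scale the gradients, so the learning rate may need adjusting when you switch between them. If you take the mean, the value of the cost function is more easily comparable between experiments when you change the batch size or image size.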
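
As a small illustration of this point, the sketch below assumes a toy model with a single shared logit b for every pixel and a made-up list of targets (both are hypothetical, chosen only so the minimizer is known in closed form). It checks that the sum and the mean have a zero gradient at the same b, that their gradients differ by the number of pixels, and that only the sum grows when more pixels are added.

```python
import math

def bce_with_logits(label, logit):
    # Numerically stable binary cross-entropy on a logit.
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sum_loss(b, targets):
    return sum(bce_with_logits(t, b) for t in targets)

def mean_loss(b, targets):
    return sum_loss(b, targets) / len(targets)

def sum_grad(b, targets):
    # d/db of the summed loss: sum over pixels of (sigmoid(b) - target).
    return sum(sigmoid(b) - t for t in targets)

def mean_grad(b, targets):
    return sum_grad(b, targets) / len(targets)

# Hypothetical targets: 3 positive pixels out of 4.
targets = [1.0, 1.0, 1.0, 0.0]

# For a shared logit, the minimizer is the logit of the positive fraction.
b_star = math.log(0.75 / 0.25)

# Same minimizer: both gradients vanish at b_star.
print(sum_grad(b_star, targets), mean_grad(b_star, targets))

# Away from the minimizer, the gradients differ by the factor len(targets).
print(sum_grad(0.0, targets), mean_grad(0.0, targets))

# Doubling the number of pixels doubles the sum but leaves the mean unchanged.
bigger = targets * 2
print(sum_loss(0.0, targets), sum_loss(0.0, bigger))
print(mean_loss(0.0, targets), mean_loss(0.0, bigger))
```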