For a classification task there are several loss functions we can use. If I simply use something like
model.compile(loss=keras.losses.categorical_crossentropy, ...)
does this mean the loss is normalized by the batch size, like
loss = -1/m * sum_i sum_c [ y_i^c * ln(y^_i^c) ]
where m is the batch size,
i is the index of a sample, and
c is the class index,
or is the loss summed over the batch?
loss = -sum_i sum_c [ y_i^c * ln(y^_i^c) ]
What I can find in the Keras API docs is that the default reduction for the loss is AUTO, which defaults "for almost all cases" to SUM_OVER_BATCH_SIZE. This would mean the loss is the scalar sum divided by the number of elements in the batch, which for a per-sample loss such as categorical cross-entropy is the batch size m.
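One way to check this is to compute both candidate reductions by hand on a tiny batch and compare them with what Keras reports for the same inputs. The sketch below is plain Python and does not call Keras. The helper name per_sample_cross_entropy and the example values are made up for illustration, and the one-hot labels and 3 classes are assumptions.

```python
import math

def per_sample_cross_entropy(y_true, y_pred):
    """Cross-entropy of a single sample: -sum_c y^c * ln(y_hat^c).

    Classes with a zero label contribute nothing, so they are skipped;
    this also avoids evaluating log(0) for zero predicted probabilities.
    """
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# Hypothetical batch of m = 2 samples with 3 classes (one-hot labels).
y_true = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y_pred = [[0.05, 0.95, 0.0], [0.1, 0.8, 0.1]]

per_sample = [per_sample_cross_entropy(t, p) for t, p in zip(y_true, y_pred)]

loss_sum = sum(per_sample)              # summed over the batch
loss_mean = loss_sum / len(per_sample)  # sum divided by the batch size m

print(f"sum over batch:  {loss_sum:.4f}")
print(f"mean over batch: {loss_mean:.4f}")
```

If the default reduction really is a batch mean, the value Keras reports for the same y_true and y_pred should match loss_mean rather than loss_sum.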