
TensorFlow estimator: average_loss vs loss


In tf.estimator, what's the difference between average_loss and loss? I would have guessed from the names that the former would be the latter divided by the number of records, but that's not the case; with a few thousand records, the latter is about three or four times the former.


Solution

  • loss is the SUM of the per-example losses in a batch, while average_loss is the MEAN of the same losses. With unit weights and full batches, the ratio between the two is therefore the batch_size argument of your input_fn, not the total number of records. If you pass batch_size=1, you should see them equal.

    The exact tensors reported depend on the particular type of tf.estimator.Estimator, but they are all very similar. Here's the source code for the regression head (used by tf.estimator.DNNRegressor):

    training_loss = losses.compute_weighted_loss(unweighted_loss, weights=weights,
                                                 reduction=losses.Reduction.SUM)
    
    mean_loss = metrics_lib.mean(unweighted_loss, weights=weights)
    

    As you can see, both are computed from the same unweighted_loss and weights tensors. The same values are also written to the TensorBoard summaries.
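
To make the relationship concrete, here is a minimal plain-Python sketch of the two reductions. It is not the Estimator code itself: the loss values are made up, the helper names sum_reduced_loss and mean_reduced_loss are hypothetical, and it assumes every example has weight 1.

```python
# Hypothetical per-example losses for one batch; the values are made up.
unweighted_loss = [0.5, 1.5, 2.0, 4.0]
weights = [1.0, 1.0, 1.0, 1.0]  # assumption: every example has weight 1


def sum_reduced_loss(losses, weights):
    """Weighted sum over the batch (the SUM-style reduction)."""
    return sum(l * w for l, w in zip(losses, weights))


def mean_reduced_loss(losses, weights):
    """Weighted mean over the batch: weighted sum divided by total weight."""
    return sum_reduced_loss(losses, weights) / sum(weights)


loss = sum_reduced_loss(unweighted_loss, weights)
average_loss = mean_reduced_loss(unweighted_loss, weights)
batch_size = len(unweighted_loss)

# With unit weights, loss / average_loss equals the batch size.
print(loss, average_loss, loss / average_loss, batch_size)
```

With non-unit weights the ratio becomes the sum of the weights instead of the batch size, and a smaller final batch gives a smaller ratio for that batch.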