I am trying to port the Keras implementation of concrete dropout in https://github.com/yaringal/ConcreteDropout/blob/master/concrete-dropout-keras.ipynb to TensorFlow 2. This is mostly straightforward, since TF 2 has most of the Keras API built into it. However, the custom losses are being cleared before fitting.
After the model is defined, and before compiling it, I can see that the losses for each concrete dropout layer have been added to the model losses by the line self.layer.add_loss(regularizer), which runs when the layers are built:
>>> print(model.losses)
[<tf.Tensor: id=64, shape=(), dtype=float32, numpy=-8.4521576e-05>, <tf.Tensor: id=168, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=272, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=376, shape=(), dtype=float32, numpy=-0.000650166>, <tf.Tensor: id=479, shape=(), dtype=float32, numpy=-0.000650166>]
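For reference, a condensed sketch of the relevant part of my port (names follow the notebook; the concrete dropout mask sampling in call() is omitted for brevity):

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Wrapper

class ConcreteDropout(Wrapper):
    def __init__(self, layer, weight_regularizer=1e-6,
                 dropout_regularizer=1e-5, init_p=0.1, **kwargs):
        super().__init__(layer, **kwargs)
        self.weight_regularizer = weight_regularizer
        self.dropout_regularizer = dropout_regularizer
        self.init_p = init_p

    def build(self, input_shape=None):
        if not self.layer.built:
            self.layer.build(input_shape)
        # Trainable dropout probability, kept in logit space.
        self.p_logit = self.add_weight(
            name='p_logit', shape=(),
            initializer=tf.keras.initializers.Constant(
                np.log(self.init_p) - np.log(1. - self.init_p)),
            trainable=True)
        p = tf.sigmoid(self.p_logit)
        input_dim = float(input_shape[-1])
        # Concrete dropout regularizer: weight term plus dropout entropy term.
        kernel_reg = self.weight_regularizer * tf.reduce_sum(
            tf.square(self.layer.kernel)) / (1. - p)
        dropout_reg = p * tf.math.log(p) + (1. - p) * tf.math.log(1. - p)
        dropout_reg = self.dropout_regularizer * input_dim * dropout_reg
        regularizer = tf.reduce_sum(kernel_reg + dropout_reg)
        # This is the line that registers the per-layer loss:
        self.layer.add_loss(regularizer)
        super().build(input_shape)

    def call(self, inputs):
        # Concrete dropout mask sampling omitted for brevity.
        return self.layer(inputs)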
After compilation, however, model.losses becomes an empty list, and the assertion assert len(model.losses) == 5 fails. If I ignore the assertion, the fact that the layer losses are being neglected shows up when training the model as the warning WARNING:tensorflow:Gradients do not exist for variables ['concrete_dropout/p_logit:0', 'concrete_dropout_1/p_logit:0', 'concrete_dropout_2/p_logit:0', 'concrete_dropout_3/p_logit:0', 'concrete_dropout_4/p_logit:0'] when minimizing the loss.
After digging into the compilation code at https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/training.py#L184, I believe the problematic lines are:
# Clear any `_eager_losses` that was added.
self._clear_losses()
Why is this being done at model compilation? And how can I add a per-layer loss in TensorFlow 2, if this is not the way to do it?
In TF 2, compile() discards losses that were added as plain eager tensors (that is what the _clear_losses() call you found is doing), because such a tensor is just a fixed snapshot of the value at build time; losses passed as callables are kept and re-evaluated whenever the model's losses are collected. Since that custom loss does not depend on the model's inputs, you should add it using a zero-argument callable like this:
self.layer.add_loss(lambda: regularizer)
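Note that under eager execution the regularizer tensor computed in build() would keep returning the same stale constant from inside the lambda, so in practice you want the computation itself inside the callable. A minimal sketch of that variation, assuming a build() along the lines of the one sketched in the question, so the term is re-evaluated from the current p_logit and kernel on every step:

def build(self, input_shape=None):
    if not self.layer.built:
        self.layer.build(input_shape)
    self.p_logit = self.add_weight(
        name='p_logit', shape=(),
        initializer=tf.keras.initializers.Constant(
            np.log(self.init_p) - np.log(1. - self.init_p)),
        trainable=True)
    input_dim = float(input_shape[-1])

    def regularizer():
        # Re-evaluated on every pass, so it sees the current values
        # of p_logit and the wrapped layer's kernel.
        p = tf.sigmoid(self.p_logit)
        kernel_reg = self.weight_regularizer * tf.reduce_sum(
            tf.square(self.layer.kernel)) / (1. - p)
        dropout_reg = p * tf.math.log(p) + (1. - p) * tf.math.log(1. - p)
        return tf.reduce_sum(
            kernel_reg + self.dropout_regularizer * input_dim * dropout_reg)

    # A zero-argument callable survives compile(), and gradients
    # can flow into p_logit when the loss is minimized.
    self.layer.add_loss(regularizer)
    super().build(input_shape)

With this change the five per-layer terms should still appear in model.losses after compile(), and the missing-gradient warning for the p_logit variables should go away.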