tensorflow keras deep-learning keras-layer tf.keras

Adding custom metric Keras Subclassing API

I'm following the section "Losses and Metrics Based on Model Internals" on chapter 12 of "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition - Aurélien Geron", in which he shows how to add custom losses and metrics that do not depend on labels and predictions.

To illustrate this, we add a custom "reconstruction loss" by adding a layer on top of the upper hidden layer which should reproduce the input. The loss is the mean squared difference betweeen the reconstruction loss and the inputs.

He shows the code for adding the custom loss, which works nicely, but even following his description I cannot make add the metric, since it raises `ValueError". He says:

Similarly, you can add a custom metric based on model internals by computing it in any way you want, as long as the result is the output of a metric object. For example, you can create a keras.metrics.Mean object in the constructor, then call it in the call() method, passing it the recon_loss, and finally add it to the model by calling the model’s add_metric() method.

This is the code(I have added #MINE for the lines I have added myself)

import tensorflow as tf
from tensorflow import keras
class ReconstructingRegressor(keras.models.Model):
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [keras.layers.Dense(30, activation="selu",
                                          kernel_initializer="lecun_normal")
                       for _ in range(5)]
        self.out = keras.layers.Dense(output_dim)
        self.reconstruction_mean = keras.metrics.Mean(name="reconstruction_error") #MINE

    def build(self, batch_input_shape):
        n_inputs = batch_input_shape[-1]
        self.reconstruct = keras.layers.Dense(n_inputs)
        super().build(batch_input_shape)

    def call(self, inputs, training=None):
        Z = inputs
        for layer in self.hidden:
            Z = layer(Z)
        reconstruction = self.reconstruct(Z)
        recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
        self.add_loss(0.05 * recon_loss)
        if training:                                      #MINE
            result = self.reconstruction_mean(recon_loss) #MINE
        else:                                             #MINE
            result = 0.                                   #MINE, I have also tried different things here, 
                                                          #but the help showed a similar sample to this.
        self.add_metric(result, name="foo")               #MINE
        return self.out(Z)

Then compiling and fitting the model:

training_set_size=10
X_dummy = np.random.randn(training_set_size, 8) 
y_dummy = np.random.randn(training_set_size, 1)

model = ReconstructingRegressor(1)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X_dummy, y_dummy, epochs=2)

Which throws:


ValueError: in converted code:

    <ipython-input-296-878bdeb30546>:26 call  *
        self.add_metric(result, name="foo")               #MINE
    C:\Users\Kique\Anaconda3\envs\piz3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py:1147 add_metric
        self._symbolic_add_metric(value, aggregation, name)
    C:\Users\Kique\Anaconda3\envs\piz3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py:1867 _symbolic_add_metric
        'We do not support adding an aggregated metric result tensor that '

    ValueError: We do not support adding an aggregated metric result tensor that is not the output of a `tf.keras.metrics.Metric` metric instance. Without having access to the metric instance we cannot reset the state of a metric after every epoch during training. You can create a `tf.keras.metrics.Metric` instance and pass the result here or pass an un-aggregated result with `aggregation` parameter set as `mean`. For example: `self.add_metric(tf.reduce_sum(inputs), name='mean_activation', aggregation='mean')`

Having read that, I tried similar things to solve that issue but it just led to different errors. How can I solve this? What is the "correct" way to do this?

I'm using conda on Windows, with tensorflow-gpu 2.1.0 installed.

Solution

The problem is just right here:

def call(self, inputs, training=None):
    Z = inputs
    for layer in self.hidden:
        Z = layer(Z)
    reconstruction = self.reconstruct(Z)
    recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
    self.add_loss(0.05 * recon_loss)
    if training:                                      
        result = self.reconstruction_mean(recon_loss) 
    else:                                             
        result = 0.#<---Here!                                          
    self.add_metric(result, name="foo")              
    return self.out(Z)

The error says that add_metric only gets a metric derived from tf.keras.metrics.Metric but 0 is a scalar, not a metric type.

My proposed solution is to simply do that:

def call(self, inputs, training=None):
    Z = inputs
    for layer in self.hidden:
        Z = layer(Z)
    reconstruction = self.reconstruct(Z)
    recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
    self.add_loss(0.05 * recon_loss)
    if training:                                      
        result = self.reconstruction_mean(recon_loss)                           
        self.add_metric(result, name="foo")              
    return self.out(Z)

This way, your mean reconstruction_error will be shown only in training time.

Since you work with eager mode, you should create your layer with dynamic=True as below:

model = ReconstructingRegressor(1,dynamic=True)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X_dummy, y_dummy, epochs=2, batch_size=10)

P.S - pay attention, that when calling model.fit or model.evaluate you should also make sure that the batch size divides your train set (since this is a stateful network). So, call those function like this: model.fit(X_dummy, y_dummy, epochs=2, batch_size=10) or model.evaluate(X_dummy,y_dummy, batch_size=10). Good Luck!