How to implement an adaptive loss in Keras?


I am trying to use Keras to implement the work in "A General and Adaptive Robust Loss Function". The author provides TensorFlow code that handles the hard details; I just want to use his prebuilt function in Keras.

His custom loss function learns a parameter 'alpha' that controls the shape of the loss function. I would like to track 'alpha' in addition to the loss during training.
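To make "controls the shape" concrete, here is a minimal sketch of the fixed-alpha form of the loss as I understand it from the paper. It is not the author's implementation: the name `general_loss` and its arguments are my own, it works on a single scalar residual, and it only covers 0 < alpha < 2 (the open range implied by the `alpha_lo` and `alpha_hi` defaults of lossfun), so the limiting cases need no special handling. The author's lossfun learns alpha and scale instead of taking them as arguments.

```python
import math


def general_loss(x, alpha, scale=1.0):
    """Fixed-alpha general robust loss for one scalar residual x.

    Hypothetical helper for illustration; assumes 0 < alpha < 2.
    """
    if not 0.0 < alpha < 2.0:
        raise ValueError("this sketch only handles 0 < alpha < 2")
    b = abs(alpha - 2.0)
    z = (x / scale) ** 2
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)


# Compare how strongly different alphas penalize small and large residuals.
for alpha in (0.5, 1.0, 1.999):
    losses = [round(general_loss(x, alpha), 3) for x in (0.0, 1.0, 10.0)]
    print("alpha =", alpha, "losses =", losses)
```

Near alpha = 2 the loss behaves like half the squared error, while smaller values of alpha grow more slowly for large residuals, which is what makes them less sensitive to outliers.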

I am somewhat familiar with Keras custom loss functions and wrappers, but I am not sure how to use callbacks to track 'alpha'. Below is how I would naively construct the loss function in Keras. However, I am not sure how I would then access 'alpha' in order to track it.

From the provided TensorFlow code, the function lossfun(x) returns a tuple:

def lossfun(x,
            alpha_lo=0.001,
            alpha_hi=1.999,
            alpha_init=None,
            scale_lo=1e-5,
            scale_init=1.,
            **kwargs):
    """
    Returns:
        A tuple of the form (`loss`, `alpha`, `scale`).
    """
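
This is my naive wrapper:

def customAdaptiveLoss():
    def wrappedloss(y_true, y_pred):
        # lossfun is the author's function; alpha and scale are discarded here.
        loss, alpha, scale = lossfun(y_true - y_pred)
        return loss
    return wrappedloss

# `model` is an already-built Keras model. Note the call: compile needs the
# inner loss function, not the factory itself.
model.compile(optimizer=optimizers.Adam(0.001),
              loss=customAdaptiveLoss())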

Again, what I am hoping to do is track the variable 'alpha' during training.


Solution

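  • The following example reports alpha as a metric. It was tested in Colab and relies on TensorFlow 1.x APIs (tf.variable_scope, tf.global_variables_initializer, K.get_session).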

    %%
    !git clone https://github.com/google-research/google-research.git
    
    %%
    import sys
    sys.path.append('google-research')
    from robust_loss.adaptive import lossfun
    
    # The robust_loss implementation loads a data file relative to the
    # current working directory, so switch into the cloned repository.
    import os
    os.chdir('google-research')
    
    import numpy as np
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras.layers import *
    from tensorflow.keras.models import Model
    from tensorflow.keras import backend as K
    
    class RobustAdaptativeLoss(object):
      def __init__(self):
        # Backend variable that mirrors the latest alpha so a metric can read it.
        z = np.array([[0]])
        self.v_alpha = K.variable(z)
    
      def loss(self, y_true, y_pred, **kwargs):
        x = y_true - y_pred
        x = K.reshape(x, shape=(-1, 1))
        with tf.variable_scope("lossfun", reuse=True):
          loss, alpha, scale = lossfun(x)
        op = K.update(self.v_alpha, alpha)
        # The alpha update must be part of the graph but it should
        # not influence the result.
        return loss + 0 * op
    
      def alpha(self, y_true, y_pred):
        # Keras metric signature; the arguments are unused.
        return self.v_alpha
    
    def make_model():
      inp = Input(shape=(3,))
      out = Dense(1, use_bias=False)(inp)
      model = Model(inp, out)
      loss = RobustAdaptativeLoss()
      model.compile('adam', loss.loss, metrics=[loss.alpha])
      return model
    
    model = make_model()
    model.summary()
    
    init_op = tf.global_variables_initializer()
    K.get_session().run(init_op)
    
    # Synthetic regression data: y is a fixed linear combination of 3 features.
    FACTORS = np.array([0.5, 2.0, 5.0])
    def target_fn(x):
      return np.dot(x, FACTORS)
    
    N_SAMPLES = 100
    X = np.random.rand(N_SAMPLES, 3)
    Y = np.apply_along_axis(target_fn, 1, X)
    
    history = model.fit(X, Y, epochs=2, verbose=True)
    print('final loss:', history.history['loss'][-1])
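
  • Since the question asked about callbacks: below is a minimal sketch of recording a variable's value at the end of every epoch with a Keras callback. It assumes TensorFlow 2.x with eager execution, unlike the TF1-style example above. `VariableLogger` and `alpha_var` are hypothetical names, and `alpha_var` is only a stand-in tf.Variable that is not trained here; how to get a handle on the variable that lossfun creates depends on the robust_loss version and is not covered by this sketch.

```python
import numpy as np
import tensorflow as tf


class VariableLogger(tf.keras.callbacks.Callback):
    """Hypothetical helper: records a variable's value after every epoch."""

    def __init__(self, variable):
        super().__init__()
        self.variable = variable
        self.values = []

    def on_epoch_end(self, epoch, logs=None):
        # Read the current value eagerly and keep it as a plain Python float.
        value = float(np.squeeze(self.variable.numpy()))
        self.values.append(value)
        print(f"epoch {epoch}: alpha = {value:.4f}")


# Stand-in for the learned shape parameter; it is NOT trained in this sketch.
alpha_var = tf.Variable(1.0, dtype=tf.float32)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1, use_bias=False),
])
model.compile(optimizer="adam", loss="mse")

# Same kind of synthetic data as in the example above.
X = np.random.rand(100, 3).astype("float32")
Y = (X @ np.array([0.5, 2.0, 5.0], dtype="float32")).reshape(-1, 1)

logger = VariableLogger(alpha_var)
model.fit(X, Y, epochs=2, verbose=0, callbacks=[logger])
```

    After training, `logger.values` holds one entry per epoch, which can be plotted next to `history.history['loss']`. Keeping the values in the callback's own list avoids depending on how a particular Keras version passes `logs` on to other callbacks.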