Tags: python, tensorflow, keras, loss-function

Keras custom loss function with different weights per example


I'm trying to implement in Keras a custom loss function where each individual example (not class) has a different weight.

To be precise, given the usual y_true (e.g. <1,1,0>) and y_pred (e.g. <1,0.2,0.8>), I want to create weights (e.g. <0.81, 0.9, 1.0>) and use them with the binary_crossentropy loss function.
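For a batch of three examples, the weights I have in mind are just descending powers of 0.9, so the last example gets weight 1.0. In plain NumPy (a quick sketch of the intent, not the loss itself):

import numpy as np

# the last example gets weight 0.9**0 = 1.0, earlier examples decay by 0.9
base_factor = 0.9
weights = base_factor ** np.arange(3)[::-1]
print(weights)  # [0.81 0.9  1.  ]

Here is what I have tried: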

import numpy as np
from keras import backend as K

def my_binary_crossentropy(y_true, y_pred):
    base_factor = 0.9
    num_examples = K.int_shape(y_true)[0]

    # weight example i by base_factor ** (num_examples - i - 1),
    # so that the last example in the batch gets weight 1.0
    out = [K.pow(base_factor, num_examples - i - 1) for i in range(num_examples)]
    forgetting_factors = K.stack(out)

    return K.mean(
        forgetting_factors * K.binary_crossentropy(y_true, y_pred),
        axis=-1
    )

And it works fine with this simple example:

y_true = K.variable( np.array([1,1,0]) )
y_pred = K.variable( np.array([1,0.2,0.8]) )
print(K.eval(my_binary_crossentropy(y_true, y_pred)))

However, when I use it with model.compile(loss=my_binary_crossentropy, ...) I get the following error: TypeError: range() integer end argument expected, got NoneType.
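I suspect the batch size is simply not known when Keras builds the loss graph; this minimal check (assuming the TensorFlow backend) reproduces the None:

from keras import backend as K

# symbolic tensors have an undefined batch dimension at graph-construction time
y = K.placeholder(shape=(None, 1))
print(K.int_shape(y))     # (None, 1)
print(K.int_shape(y)[0])  # None -- hence range(None) raises the TypeError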

I have tried a few things. I replaced K.int_shape with K.shape and then got: TypeError: range() integer end argument expected, got Tensor. I further replaced range() with K.arange() and then got: TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.

Can anyone help me, please? What am I missing? Many thanks!


Solution

  • K.pow can take a tensor of exponents as its argument. So you can compute the exponents first, as a tensor ([num_examples - 1, num_examples - 2, ..., 0]), and then feed this tensor into K.pow. Here num_examples is simply K.shape(y_pred)[0], which is itself a tensor.

    def my_binary_crossentropy(y_true, y_pred):
        base_factor = 0.9
        # the batch size is only known at run time, so take it from the dynamic shape
        num_examples = K.cast(K.shape(y_pred)[0], K.floatx())
        # exponents [num_examples - 1, ..., 1, 0]: the last example gets weight 1.0
        exponents = num_examples - K.arange(num_examples) - 1
        forgetting_factors = K.pow(base_factor, exponents)
        # (batch_size,) -> (batch_size, 1) so the factors broadcast over the outputs
        forgetting_factors = K.expand_dims(forgetting_factors, axis=-1)
        forgetting_factors = K.print_tensor(forgetting_factors)  # only for debugging

        loss = K.mean(
            forgetting_factors * K.binary_crossentropy(y_true, y_pred),
            axis=-1
        )
        loss = K.print_tensor(loss)  # only for debugging
        return loss
    

    As an example, with the following model, the output printed by the two K.print_tensor statements looks like this:

    model = Sequential()
    model.add(Dense(1, activation='sigmoid', input_shape=(100,)))
    model.compile(loss=my_binary_crossentropy, optimizer='adam')
    
    model.evaluate(np.zeros((3, 100)), np.ones(3), verbose=0)
    [[0.809999943][0.9][1]]
    [0.56144917 0.623832464 0.693147182]
    
    model.evaluate(np.zeros((6, 100)), np.ones(6), verbose=0)
    [[0.590489924][0.656099916][0.728999913]...]
    [0.409296423 0.454773813 0.505304217...]
    

    The numbers are not exact due to rounding errors. The forgetting_factors (the first line printed after each model.evaluate call) are indeed the powers of 0.9. You can also verify that the returned loss values decay by a factor of 0.9 (0.623832464 = 0.693147182 * 0.9, 0.56144917 ≈ 0.693147182 * 0.9 ** 2, etc.).
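    If you want to double-check where 0.693147182 comes from: the inputs are all zeros and the Dense layer's bias is zero-initialized, so the model outputs sigmoid(0) = 0.5, and the unweighted crossentropy against a target of 1 is -log(0.5) = ln 2. A quick NumPy check (a sketch, outside of Keras) reproduces the printed losses up to float32 rounding:

    import numpy as np

    # unweighted binary crossentropy of y_true=1 against y_pred=0.5
    bce = -np.log(0.5)                   # 0.6931471805599453
    factors = 0.9 ** np.arange(3)[::-1]  # [0.81, 0.9, 1.0]
    print(factors * bce)                 # [0.56144922 0.62383246 0.69314718]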