Tags: tensorflow, keras, deep-learning, neural-network, loss-function

Custom categorical loss function for a variable number of labels


How can I build a custom loss function for a variable number of labels:

import numpy as np

y_true = np.array([[0., 1., -1., -1.],
                   [1., 0., 1., -1.]])

y_pred = np.array([[0.3, 0.5, 0.2, 0.1],
                   [0.4, 0.3, 0.7, 0.5]])

In this example, the batch size is 2 and each sample has a variable number of labels (padded to a maximum length of 4). The -1 values in y_true are a mask, and I would like to compute binary cross-entropy only between the remaining values ([[0., 1.], [1., 0., 1.]] and [[0.3, 0.5], [0.4, 0.3, 0.7]]). Thank you!
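
For concreteness, the unmasked values can be pulled out with a boolean mask; tf.ragged.boolean_mask handles the resulting ragged shape (a quick sketch added for illustration, not part of the original question):

import tensorflow as tf

mask = tf.math.not_equal(y_true, -1.)
print(tf.ragged.boolean_mask(tf.constant(y_true), mask))
# <tf.RaggedTensor [[0.0, 1.0], [1.0, 0.0, 1.0]]>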

So far I can only create such a function for a 1-D tensor:

def my_loss_fn(y_true, y_pred):
    # Keep only the leading entries that are not the -1 mask value
    # (this assumes the padding always sits at the end of the vector)
    bools = tf.math.not_equal(y_true, tf.constant(-1.))
    n = tf.math.count_nonzero(bools)
    y_true = y_true[:n]
    y_pred = y_pred[:n]
    bce = tf.keras.losses.BinaryCrossentropy()
    return bce(y_true, y_pred)
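
For example, a quick check of this 1-D version on the first sample (added for illustration; note the function relies on the -1 padding sitting at the end of the vector):

print(my_loss_fn(tf.constant([0., 1., -1., -1.]),
                 tf.constant([0.3, 0.5, 0.2, 0.1])).numpy())
# ~0.525: the BCE averaged over the first two entries only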

Solution

  • You can just compute the loss point-wise (pass reduction='none') and use your -1 test as a mask.

    def my_loss_fn(y_true, y_pred):
        # 1.0 where the label is real, 0.0 where it is -1 padding
        mask = tf.cast(tf.math.not_equal(y_true, tf.constant(-1.)), tf.float32)
        # Add a trailing axis so the BCE's built-in reduction over the last
        # axis produces one loss value per label rather than per sample
        y_true, y_pred = tf.expand_dims(y_true, axis=-1), tf.expand_dims(y_pred, axis=-1)
        bce = tf.keras.losses.BinaryCrossentropy(reduction='none')
        # Zero out the padded positions, then average over the real ones only
        return tf.reduce_sum(tf.cast(bce(y_true, y_pred), tf.float32) * mask) / tf.reduce_sum(mask)
    
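    Applied to the arrays from the question (a quick check, not part of the original answer), this averages the point-wise BCE over the five real labels:

        loss = my_loss_fn(tf.constant(y_true, tf.float32), tf.constant(y_pred, tf.float32))
        print(loss.numpy())  # ~0.536: the sum of the 5 unmasked BCE terms divided by 5
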

    Depending on how you want to average, you might instead take a per-row sum of the mask and normalise each sample's loss separately, as sketched below.
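
    For example, a minimal sketch of that per-row variant (the name my_loss_fn_per_row is made up here for illustration):

        def my_loss_fn_per_row(y_true, y_pred):
            mask = tf.cast(tf.math.not_equal(y_true, -1.), tf.float32)
            y_true, y_pred = tf.expand_dims(y_true, axis=-1), tf.expand_dims(y_pred, axis=-1)
            bce = tf.keras.losses.BinaryCrossentropy(reduction='none')
            # Mean over each row's real labels first, then over the batch,
            # so a sample with 2 labels weighs the same as one with 3
            per_row = tf.reduce_sum(bce(y_true, y_pred) * mask, axis=-1) / tf.reduce_sum(mask, axis=-1)
            return tf.reduce_mean(per_row)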