python, tensorflow, machine-learning, keras, loss-function

Keras custom loss function - shape mismatch despite returning same shape as categorical crossentropy


I've created a custom loss function based on cosine:

import tensorflow as tf

def cos_loss(y_true, y_pred):
    norm_pred = tf.math.l2_normalize(y_pred)
    dprod = tf.tensordot(
        a=y_true,
        b=norm_pred,
        axes=1
    )
    return 1 - dprod

However, training a model with this custom loss results in the error In[0] mismatch In[1] shape: 2 vs. 8: [8,2] [8,2] 0 0. If I use a built-in loss function like categorical cross-entropy, the model trains without issue.

This is despite my custom loss and categorical crossentropy returning values of exactly the same type and shape. For example, I create test y_true and y_pred values and run them through both functions:

import numpy as np

test_true = np.asarray([1.0, 0.0])
test_pred = np.asarray([0.9, 0.2])
print(cos_loss(test_true, test_pred))
print(tf.keras.losses.categorical_crossentropy(test_true, test_pred))

which returns:

> tf.Tensor(0.023812939816047263, shape=(), dtype=float64)
  tf.Tensor(0.20067069546215124, shape=(), dtype=float64)

So both return TF tensors holding a single float64 value with an empty (scalar) shape. Why am I getting a shape mismatch error with one loss but not the other if the output shapes are the same? Thanks.


Solution

  • Your loss function should be able to take in a batch of predictions and ground truths and return a batch of loss values. At the moment that's not the case: for 2-D inputs, tf.tensordot with axes=1 is a matrix multiplication, so as soon as you introduce a batch dimension the last axis of y_true is contracted against the first (batch) axis of norm_pred, and the dimensions conflict. A quick check with batched inputs follows after the code below.

    You can probably use the following instead:

    def cos_loss(y_true, y_pred):
        # Normalize each prediction vector independently (per sample)
        norm_pred = tf.math.l2_normalize(y_pred, axis=-1)
        # Per-sample dot product along the class axis -> shape (batch_size,)
        dprod = tf.reduce_sum(y_true * norm_pred, axis=-1)
        return 1 - dprod
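
    As a quick sanity check (a minimal sketch, assuming a batch of 8 samples and 2 classes to match the [8, 2] shapes in your error message, and using the cos_loss defined above), the revised loss returns one value per sample, which is what Keras expects when it calls the loss on a batch:

    import tensorflow as tf

    # Fake batch: 8 one-hot labels and 8 prediction vectors, each of length 2
    y_true = tf.one_hot(tf.random.uniform([8], maxval=2, dtype=tf.int32), depth=2)
    y_pred = tf.random.uniform([8, 2])

    print(cos_loss(y_true, y_pred).shape)  # (8,): one loss value per sample

    By contrast, tf.tensordot(y_true, norm_pred, axes=1) on these [8, 2] inputs tries to contract the last axis of y_true (length 2) against the first axis of norm_pred (length 8), which is exactly the "2 vs. 8" conflict reported in your error.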