python tensorflow machine-learning keras deep-learning

Incompatible shapes when training Keras with custom loss function


When running the code below I get an "Incompatible shapes" error from Keras. I have seen several similar questions about custom loss functions, but none involving incompatible shapes. Does this issue come from my custom loss itself or from something deeper in Keras?

tensorflow==2.13.0

import numpy as np
import pandas as pd
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

N = 1000
df = pd.DataFrame({
    'Feature1': np.random.normal(loc=0, scale=1, size=N),
    'Feature2': np.random.normal(loc=1, scale=2, size=N),
    'Label': np.random.choice([0, 1], size=N)
})

df_train = df.sample(frac = 0.80, random_state = 42)
df_test = df[~df.index.isin(df_train.index)]
print(f"df_train.shape = {df_train.shape}")
print(f"df_test.shape = {df_test.shape}")

X_train, y_train = df_train[['Feature1', 'Feature2']], df_train['Label']
X_test, y_test = df_test[['Feature1', 'Feature2']], df_test['Label']

def my_loss(data, y_pred):
    y_true = data[:, 0]
    amount = data[:, 1]
    amount_true = amount * y_true
    amount_pred = amount * y_pred
    error = amount_pred - amount_true
    return sum(error)

y_train_plus_amt = np.append(y_train.values.reshape(-1, 1),
    X_train['Feature1'].values.reshape(-1, 1), axis = 1)

M = Sequential()
M.add(Dense(16, input_shape=(X_train.shape[1],), activation = 'relu'))
M.compile(optimizer='adam', loss = my_loss, run_eagerly = True)
M.fit(X_train, y_train_plus_amt, epochs=10, batch_size=64)


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/venv/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "<stdin>", line 5, in my_loss
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__Mul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [64] vs. [64,16] [Op:Mul] name: 

Solution

  • The section of your loss function

    amount_pred = amount * y_pred
    

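    is an element-wise multiplication, not a matrix multiplication, and its operands have shapes (64,) and (64, 16). These are the [64] vs. [64,16] reported in the error message.

    Element-wise operations follow broadcasting rules: the shapes are aligned from the last dimension backwards, and each pair of dimensions must either be equal or contain a 1. Here the last dimensions are 64 and 16, so the shapes are incompatible and the product is not defined. The 16 comes from the model: its only layer is Dense(16), so y_pred holds 16 values per sample, while amount holds one. One way to make the shapes agree is to end the model with a Dense(1) layer, so that it produces one prediction per sample, and then drop the trailing axis of y_pred inside the loss so that it has the same shape (64,) as amount.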
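The shapes involved can be checked outside Keras. The sketch below uses NumPy arrays as stand-ins for the tensors inside my_loss, since TensorFlow's element-wise operators follow the same broadcasting rules. The Dense(1) output shape and every name other than amount, y_true and y_pred are assumptions made for illustration and do not appear in the original code.

```python
import numpy as np

batch_size, units = 64, 16

# Stand-ins for the tensors inside my_loss (the values are arbitrary).
amount = np.full(batch_size, 2.0)               # like data[:, 1], shape (64,)
y_true = np.ones(batch_size)                    # like data[:, 0], shape (64,)
y_pred_dense16 = np.ones((batch_size, units))   # output of Dense(16), shape (64, 16)

# Element-wise product: the trailing dimensions 64 and 16 cannot be broadcast.
try:
    amount * y_pred_dense16
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Assumed fix: the model ends with Dense(1), so y_pred has shape (64, 1).
y_pred_dense1 = np.ones((batch_size, 1))

# Pitfall: (64,) * (64, 1) raises no error but broadcasts to (64, 64).
silent_shape = (amount * y_pred_dense1).shape

# Dropping the trailing axis leaves one prediction per sample, shape (64,).
y_pred_flat = y_pred_dense1[:, 0]
error = amount * y_pred_flat - amount * y_true
total = error.sum()

print(broadcast_failed, silent_shape, error.shape, total)
```

In the Keras loss, the equivalent step would be y_pred = y_pred[:, 0] before the multiplication, together with a final Dense(1) layer in the model. This is one possible fix rather than the only one. Note that with a Dense(1) output and no flattening the loss would run without an error but would sum over a (64, 64) array, which is probably not what is intended.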