Tags: python-3.x, keras, neural-network, runtime-error, chess

TensorFlow Keras: "Dimensions must be equal" ValueError


I'm trying to train a model in Keras to suggest the best possible next move when presented with a pawn chess board. The board is represented as a list of 64 integers (0 for empty, 1 for player, 2 for enemy). The output is a pair consisting of a field and the direction in which the piece on that field should move, which means I need two output layers of size 64 (number of fields) and 5 (number of possible move directions, including two forward and no move for when the game is over). I have a list of boards and a list of solutions. When I try to fit the model, however, I get the error mentioned above.

The exact error message is:

Epoch 1/75
Traceback (most recent call last):
  File "C:\Users\lulll\Documents\CodeStuff\tfTesting\main.py", line 75, in <module>
    model.fit(train_fig_starts, train_fig_moves, epochs=75)
  File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\lulll\AppData\Local\Temp\__autograph_generated_filej0zia4d5.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1249, in train_function  *
        return step_function(self, iterator)
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1233, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1222, in run_step  **
        outputs = model.train_step(data)
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1024, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\training.py", line 1082, in compute_loss
        return self.compiled_loss(
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\engine\compile_utils.py", line 265, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\losses.py", line 152, in __call__
        losses = call_fn(y_true, y_pred)
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\losses.py", line 284, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\losses.py", line 2176, in binary_crossentropy
        backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
    File "C:\Users\lulll\Documents\CodeStuff\tfTesting\venv\lib\site-packages\keras\backend.py", line 5688, in binary_crossentropy
        bce = target * tf.math.log(output + epsilon())

    ValueError: Dimensions must be equal, but are 2 and 64 for '{{node binary_crossentropy/mul}} = Mul[T=DT_FLOAT](binary_crossentropy/Cast, binary_crossentropy/Log)' with input shapes: [?,2], [?,64].

I have absolutely no idea what is causing this. I've searched for the error already, but the only mentions I've found seem to describe a completely different scenario. In case it helps, here's the code used to create and fit the model:

inputs = tf.keras.layers.Input(shape=64)
x = tf.keras.layers.Dense(32, activation='relu')(inputs)
out_field = tf.keras.layers.Dense(64, name="field")(x)
out_movement = tf.keras.layers.Dense(5, name="movement")(x)

model = tf.keras.Model(inputs=inputs, outputs=[out_field, out_movement])

model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

model.fit(train_fig_starts, train_fig_moves, epochs=75)  # train_fig_starts and train_fig_moves are defined above
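
For readers hitting the same traceback: the shapes in its final line, [?,2] and [?,64], are most likely the label batch (each label is a [field, direction] pair) and the 64-unit "field" output. Binary cross-entropy multiplies the two elementwise, and a trailing dimension of 2 cannot be broadcast against 64. The following is a minimal pure-Python sketch of that broadcasting rule. `broadcastable` is a hypothetical helper written for illustration, not a TensorFlow function, and the batch size of 32 is an assumption (the Keras default).

```python
from itertools import zip_longest

def broadcastable(shape_a, shape_b):
    """Return True if two shapes can be combined elementwise.

    Dimensions are compared from the last axis backwards; each pair must
    be equal, or one of them must be 1 (missing axes count as 1).
    """
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    return all(a == b or a == 1 or b == 1 for a, b in pairs)

labels_shape = (32, 2)         # [field, direction] pairs, as passed to fit()
field_output_shape = (32, 64)  # output of the "field" layer (assumed batch of 32)

print(broadcastable(labels_shape, field_output_shape))  # False
print(broadcastable((32, 64), field_output_shape))      # True
```

Under this reading, the loss can only be computed once each output receives a target whose shape matches that output.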

EDIT 1: Here's a sample of the dataset I'm using (the whole thing is too long for the character limit):

train_fig_starts = [[0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 2, 0, 1, 0, 0, 0, 0, 1, 2, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 2, 1, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0], [0, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 2, 2, 2, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 2, 2, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]]
train_fig_moves = [[0, 0], [0, 0], [0, 0], [0, 0], [15, 2], [15, 2]]

EDIT 2: I changed the loss to SparseCategoricalCrossentropy since that seems closer to what I'm looking for. This is now the model code:

inputs = tf.keras.layers.Input(shape=64)
x = tf.keras.layers.Dense(64, activation='relu')(inputs)

out_field = tf.keras.layers.Dense(64, activation="relu",  name="field")(x)
out_field = tf.keras.layers.Dense(64, activation="softmax", name="field_softmax")(out_field)

out_movement = tf.keras.layers.Dense(5, activation="relu", name="movement")(x)
out_movement = tf.keras.layers.Dense(5, activation="softmax", name="movement_softmax")(out_movement)

model = tf.keras.Model(inputs=inputs, outputs=[out_field, out_movement])

print(model.summary())
tf.keras.utils.plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

model.compile(optimizer='adam',
              loss=[tf.keras.losses.SparseCategoricalCrossentropy(),
                    tf.keras.losses.SparseCategoricalCrossentropy()],
              metrics=['accuracy'])

It still throws an error, this time the following:

Node: 'sparse_categorical_crossentropy_1/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
logits and labels must have the same first dimension, got logits shape [32,5] and labels shape [64]
     [[{{node sparse_categorical_crossentropy_1/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_1666]

I have no idea why it's like that. Output logits and labels should both be [64, 2]. Since I'm using sparse cross-entropy, I should be able to use integers in my training data to signify the "index" of the output neuron with the highest logit, right? Correct me if I'm wrong. If it helps, here's a diagram of my model: [image: plot of the model]
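
On the question above: sparse categorical cross-entropy does accept integer class indices, but it expects one integer per sample for each output, so labels of shape [batch] against predictions of shape [batch, classes]. A plausible reading of the message, assuming the default batch size of 32, is that each batch of labels had shape [32, 2] and was flattened to [64], which no longer lines up with the 32 rows of the "movement" output. The sketch below uses plain Python with hypothetical helper names (these are not the Keras implementations) and made-up probabilities. It shows that an integer label and its one-hot equivalent give the same loss for a single sample.

```python
import math

def sparse_categorical_crossentropy(class_index, probs):
    """Loss for one sample given an integer class index."""
    return -math.log(probs[class_index])

def categorical_crossentropy(one_hot, probs):
    """Loss for one sample given a one-hot label vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Made-up softmax output of the 5-unit "movement" head for one board.
probs = [0.1, 0.2, 0.5, 0.1, 0.1]

sparse_loss = sparse_categorical_crossentropy(2, probs)
dense_loss = categorical_crossentropy([0, 0, 1, 0, 0], probs)
print(math.isclose(sparse_loss, dense_loss))  # True

# 32 [field, direction] pairs flatten to 64 values, matching "labels shape [64]".
batch_labels = [[15, 2]] * 32
flattened = [value for pair in batch_labels for value in pair]
print(len(flattened))  # 64
```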


Solution

  • I've now fixed the issue myself. Honestly, it was a pretty stupid mistake, but the error messages didn't explain well what was going on. I switched the outputs to one-hot encoding and changed the loss to CategoricalCrossentropy, which is also a better fit for a classification problem (the sparse variant didn't work with my integers for some reason). After that, I needed to change the label list from a single list of [field, move] pairs to a list containing two separate lists, one with the field one-hots and one with the move one-hots. If anyone runs into a similar issue and can't make sense of it, maybe this will help.
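
The label restructuring described in this answer can be sketched as follows. This is an illustration under assumptions, not the author's actual code: `one_hot` and `split_labels` are hypothetical helpers, and the sample labels are the ones shown in EDIT 1 of the question.

```python
NUM_FIELDS = 64      # squares on the board
NUM_DIRECTIONS = 5   # possible move directions, including "no move"

def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector

def split_labels(moves):
    """Turn [[field, direction], ...] into two one-hot label lists."""
    field_labels = [one_hot(field, NUM_FIELDS) for field, _ in moves]
    move_labels = [one_hot(direction, NUM_DIRECTIONS) for _, direction in moves]
    return field_labels, move_labels

train_fig_moves = [[0, 0], [0, 0], [0, 0], [0, 0], [15, 2], [15, 2]]
field_labels, move_labels = split_labels(train_fig_moves)

print(len(field_labels), len(field_labels[0]))  # 6 64
print(len(move_labels), len(move_labels[0]))    # 6 5
print(move_labels[4])                           # [0, 0, 1, 0, 0]
```

With labels in this form, each output gets its own target. Presumably the fit call then becomes something like model.fit(train_fig_starts, [field_labels, move_labels], epochs=75), with the lists converted to NumPy arrays first and CategoricalCrossentropy used as the loss for both outputs.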