Getting a ValueError in TensorFlow saying that my shapes are incompatible


Error:

return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
    C:\Users\selvaa\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\backend.py:4619 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    C:\Users\selvaa\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\tensor_shape.py:1128 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (None, 1) and (None, 151) are incompatible

My model:

x = np.array(x)
y = np.array(y)

x = x/255.0

model = Sequential()
model.add(Conv2D(3, (3,3), input_shape=(128,128,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(302, activation='relu'))
model.add(Dense(151, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y, batch_size=32, epochs=5, verbose=1, validation_split=0.1)

I'm trying to train a model to identify different Pokémon. My dataset has two pictures of each of the 151 Pokémon, all correctly labeled. I'm not sure what I'm doing wrong.

Here is what happens when I print x.shape and y.shape:

(301, 128, 128, 3) (301,)

Solution

  • You should use the loss tf.keras.losses.SparseCategoricalCrossentropy, as shown in the code example below.

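    The loss function tf.keras.losses.SparseCategoricalCrossentropy accepts integer class labels of shape (n_samples,) and predictions of shape (n_samples, n_classes). That matches your data, assuming y holds integer class indices from 0 to 150: y has shape (301,) and the final Dense layer outputs 151 values per sample. You cannot use categorical_crossentropy here because it expects the labels to be one-hot encoded with the same shape as the predictions, which is why the error compares (None, 1) against (None, 151) (see bottom of answer).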

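    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    # x and y are the image and label lists from the question.
    x = np.array(x)  # shape (301, 128, 128, 3)
    y = np.array(y)  # shape (301,), integer class labels

    # Scale pixel values to the range [0, 1].
    x = x / 255.0

    model = Sequential()
    model.add(Conv2D(3, (3,3), input_shape=(128,128,3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Flatten())
    model.add(Dense(302, activation='relu'))
    model.add(Dense(151, activation='softmax'))

    # The sparse loss takes integer labels directly, so no one-hot encoding is needed.
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        optimizer='adam',
        metrics=['accuracy'])

    model.fit(x, y, batch_size=32, epochs=5, verbose=1, validation_split=0.1)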
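
    To see why integer labels are enough for the sparse loss, here is a minimal NumPy sketch of the underlying calculation. It is not the Keras implementation, and the toy arrays (4 samples, 3 classes) are made up for illustration: for each sample, the loss is the negative log of the probability predicted for the true class.

```python
import numpy as np

# Hypothetical toy batch: 4 samples, 3 classes
# (the question has 301 samples and 151 classes).
y_true = np.array([0, 2, 1, 2])        # integer labels, shape (4,)
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6],
                   [0.2, 0.5, 0.3],
                   [0.3, 0.3, 0.4]])   # softmax-style rows, shape (4, 3)

# For each row, pick the probability assigned to the true class.
picked = y_pred[np.arange(len(y_true)), y_true]

# Sparse categorical cross-entropy per sample, then averaged over the batch.
per_sample_loss = -np.log(picked)
mean_loss = per_sample_loss.mean()

print(y_true.shape, y_pred.shape)
print(mean_loss)
```

    The labels are only used as column indices into the predictions, which is why their shape can stay (n_samples,).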
    

    Another solution is to one-hot encode your labels prior to training, for example using the function tf.one_hot. If you use this approach, then you can use categorical_crossentropy.
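
    As a rough sketch of what one-hot encoding does to the label shape, the snippet below uses plain NumPy so the result is easy to inspect. The label array is a hypothetical stand-in for the real y, and it assumes the labels are integers from 0 to 150. In TensorFlow, tf.one_hot(y, depth=151) or tf.keras.utils.to_categorical(y, num_classes=151) should give the equivalent encoding.

```python
import numpy as np

n_classes = 151

# Hypothetical stand-in for the real labels: integers in [0, 150], shape (301,).
y = np.arange(301) % n_classes

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with y encodes every label at once.
y_one_hot = np.eye(n_classes, dtype=np.float32)[y]

print(y.shape, y_one_hot.shape)

# TensorFlow equivalents (not executed here):
#   y_one_hot = tf.one_hot(y, depth=n_classes)
#   y_one_hot = tf.keras.utils.to_categorical(y, num_classes=n_classes)
# The model can then be compiled with loss='categorical_crossentropy'
# and trained with model.fit(x, y_one_hot, ...).
```

    After encoding, the labels have shape (301, 151), which matches the output of the final Dense(151) layer, so the shape check in categorical_crossentropy passes.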