Error:
return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
C:\Users\selvaa\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\keras\backend.py:4619 categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
C:\Users\selvaa\miniconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\tensor_shape.py:1128 assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (None, 1) and (None, 151) are incompatible
My model:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

x = np.array(x)
y = np.array(y)
x = x / 255.0
model = Sequential()
model.add(Conv2D(3, (3,3), input_shape=(128,128,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(302, activation='relu'))
model.add(Dense(151, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y, batch_size=32, epochs=5, verbose=1, validation_split=0.1)
I'm trying to train a model to identify different Pokemon. I have two pictures of each of the 151 Pokemon in my dataset (correctly labeled and all). Not sure what I'm doing wrong.
Here is what happens when I print x.shape and y.shape:
(301, 128, 128, 3) (301,)
You should use the loss `tf.keras.losses.SparseCategoricalCrossentropy`, as shown in the code example below. That loss function accepts reference labels of shape `(n_samples,)` and predicted labels of shape `(n_samples, n_classes)`, which matches your data. You cannot use `categorical_crossentropy`, because it expects your labels to be one-hot encoded (see the bottom of this answer).
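To illustrate the shape contract, here is a minimal NumPy sketch of what sparse categorical cross-entropy computes (a simplified illustration of the math, not the actual Keras implementation):

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_pred):
    """Mean negative log-probability of the true class.

    y_true: integer labels, shape (n_samples,)
    y_pred: predicted probabilities, shape (n_samples, n_classes)
    """
    n_samples = y_true.shape[0]
    # Pick out each sample's predicted probability for its true class.
    true_class_probs = y_pred[np.arange(n_samples), y_true]
    return -np.mean(np.log(true_class_probs))

# Three samples, four classes: labels are plain integers, not one-hot.
y_true = np.array([0, 2, 3])
y_pred = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])
loss = sparse_categorical_crossentropy(y_true, y_pred)  # about 0.357
```

This is exactly the pairing in your data: `y` holds integer class IDs of shape `(301,)`, while the softmax layer outputs shape `(301, 151)`.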
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

x = np.array(x)
y = np.array(y)
x = x / 255.0

model = Sequential()
model.add(Conv2D(3, (3, 3), input_shape=(128, 128, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(302, activation='relu'))
model.add(Dense(151, activation='softmax'))
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    optimizer='adam',
    metrics=['accuracy'])
model.fit(x, y, batch_size=32, epochs=5, verbose=1, validation_split=0.1)
Another solution is to one-hot encode your labels prior to training, for example using the function `tf.one_hot`. If you use that approach, you can keep `categorical_crossentropy`.
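For reference, here is what one-hot encoding does, sketched in plain NumPy; this is equivalent to `tf.one_hot` for 1-D integer labels (the class count of 151 comes from your model's output layer):

```python
import numpy as np

def one_hot(labels, n_classes):
    # Indexing the identity matrix by the integer labels gives the
    # one-hot encoding: output shape is (n_samples, n_classes).
    return np.eye(n_classes, dtype=np.float32)[labels]

y = np.array([0, 3, 150])      # integer Pokemon class IDs
y_onehot = one_hot(y, 151)     # shape (3, 151), one 1.0 per row
```

After this transformation the labels have shape `(n_samples, 151)`, matching the `(None, 151)` softmax output, so `categorical_crossentropy` no longer raises the shape error.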