Tags: python, tensorflow, keras, conv-neural-network

TensorFlow - ValueError: Shapes (None, 1) and (None, 10) are incompatible


I am trying to implement an image classifier using the Street View House Numbers (SVHN) dataset. I am using Format 2, which contains 32x32 RGB images of centered digits from 0 to 9. When I compile and fit the model, I get the following error:

Epoch 1/10
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-37-31870b6986af> in <module>()
      3 
      4 model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
----> 5 model.fit(trainX, trainY, validation_data=(validX, validY), batch_size=128, epochs=10)

9 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    975           except Exception as e:  # pylint:disable=broad-except
    976             if hasattr(e, "ag_error_metadata"):
--> 977               raise e.ag_error_metadata.to_exception(e)
    978             else:
    979               raise

ValueError: in user code:

    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:795 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:1259 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2730 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3417 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:788 run_step  **
        outputs = model.train_step(data)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:756 train_step
        y, y_pred, sample_weight, regularization_losses=self.losses)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/compile_utils.py:203 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/losses.py:152 __call__
        losses = call_fn(y_true, y_pred)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/losses.py:256 call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/losses.py:1537 categorical_crossentropy
        return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/backend.py:4833 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/tensor_shape.py:1134 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (None, 1) and (None, 10) are incompatible

The code is:

model = Sequential([
                    Conv2D(filters=64, kernel_size=3, strides=2, activation='relu', input_shape=(32,32,3)),
                    MaxPooling2D(pool_size=(2, 2), strides=1, padding='same'),
                    Conv2D(filters=32, kernel_size=3, strides=1, activation='relu'),
                    MaxPooling2D(pool_size=(2, 2), strides=1, padding='same'),
                    Flatten(),
                    Dense(10, activation='softmax')
])
model.summary()

Model: "sequential_10"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_23 (Conv2D)           (None, 15, 15, 64)        1792      
_________________________________________________________________
max_pooling2d_23 (MaxPooling (None, 15, 15, 64)        0         
_________________________________________________________________
conv2d_24 (Conv2D)           (None, 13, 13, 32)        18464     
_________________________________________________________________
max_pooling2d_24 (MaxPooling (None, 13, 13, 32)        0         
_________________________________________________________________
flatten_10 (Flatten)         (None, 5408)              0         
_________________________________________________________________
dense_13 (Dense)             (None, 10)                54090     
=================================================================
Total params: 74,346
Trainable params: 74,346
Non-trainable params: 0
_________________________________________________________________

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(trainX, trainY, validation_data=(validX, validY), batch_size=128, epochs=10)

I was unable to solve the error. Does anyone have any ideas on how to fix it?


Solution

  • I can't see how trainY is built, but it looks like trainY has a single column of integer class labels while the model's output layer has 10 neurons, which is why shapes (None, 1) and (None, 10) are incompatible. The categorical_crossentropy loss expects one-hot encoded targets, so you can one-hot encode trainY like this:

    from sklearn.preprocessing import LabelBinarizer
    label_as_binary = LabelBinarizer()
    train__y_labels = label_as_binary.fit_transform(trainY)
    

    Then compile and fit as before, passing train__y_labels as the targets:

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(trainX, train__y_labels, batch_size=128, epochs=1)
    

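    Note: if the validation data throws the same error, apply the same encoding to every label array, e.g. label_as_binary.transform(validY), so the columns match the training labels.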
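
If you would rather not depend on scikit-learn, the same one-hot step can be sketched in plain NumPy. This is only an illustration under assumptions: `one_hot`, `train_labels` and `valid_labels` are hypothetical names standing in for your own code and for trainY / validY, and the labels are assumed to be integers 0 to 9 with shape (N, 1).

```python
import numpy as np

def one_hot(labels, classes):
    """Map integer labels of shape (N,) or (N, 1) to one-hot rows of shape (N, len(classes))."""
    labels = np.asarray(labels).reshape(-1)   # (N, 1) -> (N,)
    classes = np.asarray(classes)
    # Compare every label against every class value: result is (N, num_classes).
    encoded = (labels[:, None] == classes[None, :]).astype("float32")
    # Each row must contain a 1 somewhere, otherwise the label was not in `classes`.
    if not encoded.any(axis=1).all():
        raise ValueError("found a label that is not in `classes`")
    return encoded

# Hypothetical stand-ins for trainY / validY (assumed integer labels 0..9).
train_labels = np.array([[1], [9], [0], [3]])
valid_labels = np.array([[0], [1]])

# Use one fixed class list for every split so the columns always line up.
classes = np.arange(10)

train_onehot = one_hot(train_labels, classes)
valid_onehot = one_hot(valid_labels, classes)

print(train_labels.shape, "->", train_onehot.shape)
```

The encoded arrays have shape (N, 10), which matches the Dense(10) output. Alternatively, you can keep the integer labels as they are and compile with loss='sparse_categorical_crossentropy', which expects class indices in the range 0 to 9 rather than one-hot rows. Either way, check the label values first: I believe the raw SVHN .mat files store the digit 0 as label 10, so if that applies to your data, remap 10 to 0 before encoding.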