ValueError when fitting keras model


I have the following code:

from sklearn.datasets import fetch_openml
import numpy as np
import keras

mnist = fetch_openml('mnist_784', version=1)   
X, y = mnist["data"], mnist["target"]

y = y.astype(np.uint8)

X_digits = [np.array(X.iloc[i]) for i in range(len(X))]
X = np.array([some_digit.reshape(28, 28) for some_digit in X_digits])

X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

model = keras.models.Sequential([
    keras.layers.Conv2D(64, 7, activation="relu", padding="same",
                        input_shape=[28, 28, 1]),
    keras.layers.MaxPooling2D(2),
    keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    keras.layers.MaxPooling2D(2),
    keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    keras.layers.MaxPooling2D(2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax")
])

model.compile(loss="categorical_crossentropy")

That all seems to work fine. But then on this line:

model.fit(X_train, y_train)

I get this error:

ValueError                                Traceback (most recent call last)
<ipython-input-19-d768f88d541e> in <module>()
----> 1 model.fit(X_train, y_train)

1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
   1127           except Exception as e:  # pylint:disable=broad-except
   1128             if hasattr(e, "ag_error_metadata"):
-> 1129               raise e.ag_error_metadata.to_exception(e)
   1130             else:
   1131               raise

ValueError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 878, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 867, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 860, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 810, in train_step
        y, y_pred, sample_weight, regularization_losses=self.losses)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 141, in __call__
        losses = call_fn(y_true, y_pred)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 245, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 1665, in categorical_crossentropy
        y_true, y_pred, from_logits=from_logits, axis=axis)
    File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 4994, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (32, 1) and (32, 10) are incompatible

What is going wrong here?


Solution

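  • As @Dr. Snoopy pointed out, the shape of your labels is not correct. The categorical_crossentropy loss expects one-hot encoded labels of shape (batch_size, 10), but your labels are integer class ids of shape (batch_size, 1), which is why the error reports "Shapes (32, 1) and (32, 10) are incompatible". After you split your data into train and test sets, make sure that your labels are properly encoded with the number of classes you want to have (in this case 10). Simply put this after your split and it should work: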

    from tensorflow.keras.utils import to_categorical
    y_train = to_categorical(y_train, 10)
    y_test = to_categorical(y_test, 10)
    y_train.shape
    

    Output should be:

    (60000, 10)
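
To see what the encoding step does to the label shape, here is a minimal NumPy-only sketch. It is not the Keras implementation, just an equivalent illustration, and `y_sample` is a hypothetical stand-in for a handful of MNIST labels:

```python
import numpy as np

# Hypothetical stand-in for a few MNIST labels (integer class ids 0-9).
y_sample = np.array([5, 0, 4, 1, 9], dtype=np.uint8)
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the matrix with the labels yields one row per sample.
y_one_hot = np.eye(num_classes, dtype=np.float32)[y_sample]

print(y_sample.shape)   # one integer per sample
print(y_one_hot.shape)  # one 10-element vector per sample
print(y_one_hot[0])     # 1.0 at index 5, zeros elsewhere
```

Each label becomes a vector of length 10, which matches the 10 outputs of the final softmax layer. If you would rather keep the integer labels as they are, compiling with loss="sparse_categorical_crossentropy" should also avoid the shape mismatch, since that loss accepts class ids directly.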