I'm trying to train this simple network on a few random data points, but training fails. The model builds without complaint, but calling fit raises a ValueError about the dimensions of the training data. Could someone help?
The code:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LocallyConnected1D
# Generate random training data
X_train = np.random.rand(1000, 100, 1)
y_train = np.random.randint(0, 10, size=(1000, 1))
# Generate random testing data
X_test = np.random.rand(100, 100, 1)
y_test = np.random.randint(0, 10, size=(100, 1))
# Create a sequential model
model = Sequential()
# Add a LocallyConnected1D layer with 32 filters, a kernel size of 3, and a ReLU activation function
model.add(LocallyConnected1D(32, 3, activation='relu', input_shape=(100, 1)))
# Add a dense layer with 10 units and a softmax activation function
model.add(Dense(10, activation='softmax'))
# Compile the model with categorical cross-entropy loss and Adam optimizer
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train the model on the training data
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))
# Evaluate the model on the testing data
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test loss:', test_loss)
print('Test accuracy:', test_acc)
and the ValueError raised is:
ValueError: in user code:
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\engine\training.py", line 1160, in train_function *
return step_function(self, iterator)
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\engine\training.py", line 1146, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\engine\training.py", line 1135, in run_step **
outputs = model.train_step(data)
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\engine\training.py", line 994, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\engine\training.py", line 1052, in compute_loss
return self.compiled_loss(
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\engine\compile_utils.py", line 265, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\losses.py", line 152, in __call__
losses = call_fn(y_true, y_pred)
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\losses.py", line 272, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\losses.py", line 1990, in categorical_crossentropy
return backend.categorical_crossentropy(
File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\keras\backend.py", line 5529, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (None, 1) and (None, 98, 10) are incompatible
Use sparse_categorical_crossentropy as your loss function and add, for example, a Flatten layer before your output layer. Here is a working example:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LocallyConnected1D, Flatten
# Generate random training data
X_train = np.random.rand(1000, 100, 1)
y_train = np.random.randint(0, 10, size=(1000, 1))
# Generate random testing data
X_test = np.random.rand(100, 100, 1)
y_test = np.random.randint(0, 10, size=(100, 1))
# Create a sequential model
model = Sequential()
# Add a LocallyConnected1D layer with 32 filters, a kernel size of 3, and a ReLU activation function
model.add(LocallyConnected1D(32, 3, activation='relu', input_shape=(100, 1)))
# Flatten the 3D output of the LocallyConnected1D layer to 2D
model.add(Flatten())
# Add a dense output layer with 10 units and a softmax activation function
model.add(Dense(10, activation='softmax'))
# Compile the model with sparse categorical cross-entropy loss and the Adam optimizer
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train the model on the training data
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))
# Evaluate the model on the testing data
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test loss:', test_loss)
print('Test accuracy:', test_acc)
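For context on the error itself: LocallyConnected1D with a kernel size of 3 on a length-100 input produces a 3D output of shape (None, 98, 32), and a Dense layer applied to that tensor acts only on the last axis, giving (None, 98, 10), which cannot be matched against labels of shape (None, 1). Flattening first collapses the output to (None, 3136). You can verify the shapes after building the model above (this is just an inspection step, not part of the fix):
# Print the per-layer output shapes: (None, 98, 32) -> (None, 3136) -> (None, 10)
model.summary()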
If you want to use categorical_crossentropy, your labels have to be one-hot encoded (see this for more information); a sketch of that variant follows below. Also note that the LocallyConnected1D layer has a 3D output, not 2D (see also this).
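For reference, here is a minimal sketch of the one-hot-encoded variant (the y_train_onehot/y_test_onehot names are just illustrative; it reuses the data and model defined above):
from keras.utils import to_categorical
# One-hot encode the integer labels: shape (1000, 1) -> (1000, 10)
y_train_onehot = to_categorical(y_train, num_classes=10)
y_test_onehot = to_categorical(y_test, num_classes=10)
# Compile with categorical cross-entropy instead of the sparse variant
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train_onehot, epochs=10, batch_size=32, validation_data=(X_test, y_test_onehot))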