I am trying to predict a sequence of 18 multi-class probability vectors (14 mutually exclusive classes) using an RNN, taking an 11-D numeric vector as input. The input vector is the same throughout the sequence prediction.
To this end, I define this model in TensorFlow:
model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[1, 11]),
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(14, return_sequences=True, activation="softmax",
                           kernel_initializer="glorot_uniform")])
model.compile(loss="CategoricalCrossentropy",
              optimizer="nadam")
When I try to fit the model:
history = model.fit(X_train, y_train, epochs=10, batch_size=32,
                    validation_data=(X_valid, y_valid))
I get the following error:
ValueError: Shapes (None, 18, 14) and (None, 1, 14) are incompatible
By contrast, no error is returned if I use "mse" as the loss function instead (although results are very poor).
For reference, X_train.shape is (18000, 1, 11), whereas y_train.shape is (18000, 18, 14).
Can you help me fix this error?
Thank you very much for your help,
The shapes produced by your network and the shape of your target must match.
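To see where the mismatch comes from, it can help to trace the shapes layer by layer. The sketch below does this in plain Python, without TensorFlow. The helper `simple_rnn_shape` is hypothetical (it is not a Keras function) and only mimics the shape rule of a recurrent layer: with return_sequences=True, the output keeps the number of timesteps of its input.

```python
# Hypothetical helper mimicking the SimpleRNN shape rule; not a Keras API.
def simple_rnn_shape(input_shape, units, return_sequences):
    batch, timesteps, _features = input_shape
    if return_sequences:
        return (batch, timesteps, units)   # one output per input timestep
    return (batch, units)                  # only the last timestep's output

shape = (None, 1, 11)                      # per-batch shape of X_train
for units in (20, 20, 14):                 # the three layers in the question
    shape = simple_rnn_shape(shape, units, return_sequences=True)

print(shape)  # (None, 1, 14)
```

With a single input timestep, every layer emits a single timestep, so the output can never reach 18 steps on its own.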
In your case, your network must produce (None, 18, 14). A simple way to do that is to use this structure:
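import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, RepeatVector

# Dummy data: 100 samples, 1 timestep, 11 features
X = np.random.uniform(0, 1, (100, 1, 11))
# Dummy one-hot targets: 18 timesteps, 14 exclusive classes
Y = np.eye(14)[np.random.randint(0, 14, (100, 18))]

model = Sequential([
    SimpleRNN(20, return_sequences=False, input_shape=[1, 11]),
    RepeatVector(18),
    SimpleRNN(14, return_sequences=True, activation="softmax")])
model.compile(loss="CategoricalCrossentropy", optimizer="nadam")
history = model.fit(X, Y, epochs=10)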
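Here we set return_sequences=False in the first layer, so it outputs a single vector of shape (None, 20), and follow it with RepeatVector(18), which copies that vector 18 times to give (None, 18, 20). The last SimpleRNN, with return_sequences=True, then produces the required (None, 18, 14).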
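As a rough illustration of what the repeat step does to the data, here is a NumPy sketch. It only emulates the effect of RepeatVector(18) on a 2-D array and is not how Keras implements the layer; the batch size of 4 and the random array `h` are made-up stand-ins for the first layer's output.

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.uniform(size=(4, 20))   # stand-in for the first layer's output: (batch, units)

# Emulate RepeatVector(18): add a time axis, then copy the vector along it.
repeated = np.repeat(h[:, np.newaxis, :], 18, axis=1)

print(repeated.shape)                                    # (4, 18, 20)
print(np.array_equal(repeated[:, 0], repeated[:, 17]))   # True
```

Every timestep receives the same vector, so any variation across the 18 outputs comes from the recurrent state of the last layer.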