python, tensorflow, keras, recurrent-neural-network, loss-function

What loss function can I use for an RNN sequence of multi-class classifications?


I am trying to predict a sequence of 18 multi-class probability vectors (14 exclusive classes) using an RNN, taking an 11-D numeric vector as input. The same input vector is used for every step of the predicted sequence.

To this end, I define this model in tensorflow:

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[1, 11]),
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(14, return_sequences=True, activation="softmax",
                           kernel_initializer="glorot_uniform")])

model.compile(loss="CategoricalCrossentropy",
              optimizer="nadam")

When I try to fit the model

history = model.fit(X_train, y_train, epochs=10, batch_size=32,
                    validation_data=(X_valid, y_valid))

I get the following error:

ValueError: Shapes (None, 18, 14) and (None, 1, 14) are incompatible

By contrast, no error is returned if I use "mse" as the loss function instead (although results are very poor).

For reference, X_train.shape is (18000, 1, 11), whereas y_train.shape is (18000, 18, 14).
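
Checking the network's output shape shows where the second shape in the error comes from:

# The input has a single timestep, so with return_sequences=True every layer
# (including the last one) emits only 1 step of 14 values.
print(model.output_shape)  # (None, 1, 14)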

Can you help me fix this error?

Thank you very much for your help.


Solution

  • The shape produced by your network must match the shape of your target.

    In your case, your network must produce (None, 18, 14). A simple way to do that is to use this structure:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras.layers import SimpleRNN, RepeatVector
    from tensorflow.keras.models import Sequential

    X = np.random.uniform(0, 1, (100, 1, 11))
    # Dummy one-hot targets: 18 steps, each one of 14 exclusive classes
    Y = keras.utils.to_categorical(np.random.randint(0, 14, (100, 18)), num_classes=14)

    model = Sequential([
        SimpleRNN(20, return_sequences=False, input_shape=[1, 11]),
        RepeatVector(18),
        SimpleRNN(14, return_sequences=True, activation="softmax")])

    model.compile(loss="CategoricalCrossentropy", optimizer="nadam")
    history = model.fit(X, Y, epochs=10)
    
    

    Here we set return_sequences=False on the first recurrent layer and follow it with a RepeatVector(18) layer, so the input is repeated 18 times and the network outputs 18 timesteps, giving the required (None, 18, 14).
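
    To double-check the fix, you can verify that the model now produces one 14-way probability distribution per timestep (a quick sketch reusing the dummy X defined above):

    print(model.output_shape)   # (None, 18, 14) -- now matches the target shape

    preds = model.predict(X[:2])
    print(preds.shape)          # (2, 18, 14)
    print(preds[0, 0].sum())    # ~1.0, since the softmax yields a probability vector per step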