python, tensorflow, keras, recurrent-neural-network, loss-function

What loss function can I use for an RNN sequence of multi-class classifications?


I am trying to predict a sequence of 18 multi-class probability vectors (14 exclusive classes) using an RNN, taking an 11-D numeric vector as input. The same input vector is used for every step of the predicted sequence.

To this end, I define this model in tensorflow:

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[1, 11]),
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(14, return_sequences=True, activation="softmax",
                           kernel_initializer="glorot_uniform")])

model.compile(loss="CategoricalCrossentropy",
              optimizer="nadam")

When I try to fit the model

history = model.fit(X_train, y_train, epochs=10, batch_size=32,
                    validation_data=(X_valid, y_valid))

I get the following error:

ValueError: Shapes (None, 18, 14) and (None, 1, 14) are incompatible

By contrast, no error is returned if I use "mse" as the loss function instead (although results are very poor).

For reference, X_train.shape is (18000, 1, 11), whereas y_train.shape is (18000, 18, 14).
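
Checking the network's output shape shows where the second shape in the error comes from:

# The input has a single timestep, so with return_sequences=True every layer
# (including the last one) emits only 1 step of 14 values.
print(model.output_shape)  # (None, 1, 14)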

Can you help me fix this error?

Thank you very much for your help.


Solution

  • The shape produced by your network must match the shape of your target.

    In your case, your network must produce (None, 18, 14). A simple way to do that is to use this structure:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras.layers import SimpleRNN, RepeatVector
    from tensorflow.keras.models import Sequential

    X = np.random.uniform(0, 1, (100, 1, 11))
    # Dummy one-hot targets: 18 steps, each one of 14 exclusive classes
    Y = keras.utils.to_categorical(np.random.randint(0, 14, (100, 18)), num_classes=14)

    model = Sequential([
        SimpleRNN(20, return_sequences=False, input_shape=[1, 11]),
        RepeatVector(18),
        SimpleRNN(14, return_sequences=True, activation="softmax")])

    model.compile(loss="CategoricalCrossentropy", optimizer="nadam")
    history = model.fit(X, Y, epochs=10)
    
    

    Here we set return_sequences=False on the first recurrent layer and follow it with a RepeatVector(18) layer, so the input is repeated 18 times and the network outputs 18 timesteps, giving the required (None, 18, 14).
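
    To double-check the fix, you can verify that the model now produces one 14-way probability distribution per timestep (a quick sketch reusing the dummy X defined above):

    print(model.output_shape)   # (None, 18, 14) -- now matches the target shape

    preds = model.predict(X[:2])
    print(preds.shape)          # (2, 18, 14)
    print(preds[0, 0].sum())    # ~1.0, since the softmax yields a probability vector per step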