python tensorflow keras recurrent-neural-network multiclass-classification

Multi-dimensional y_train in Keras

I have two corpora, one for x_train and one for y_train, which I preprocess like this:

input_sequences = []
labels = []

indexCA = 0

for line in corpusMSA:
    lineCA = corpusCA[indexCA].split() # Save CA Line
    token_list = tokenizer.texts_to_sequences([line])[0] # Tokenize line
    for i in range(1, len(token_list)):
        n_gram_sequence = token_list[:i+1] # Prefix of the first i+1 tokens (length 2 and up)
        n_gram_label = lineCA[:i+1]
        input_sequences.append(n_gram_sequence)
        labels.append(n_gram_label)
    indexCA+=1

# pad sequences 
max_sequence_len = max([len(x) for x in input_sequences])
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_sequence_len, padding='pre'))

max_labels_len = max([len(x) for x in labels])
labels = np.array(pad_sequences(labels, maxlen=max_labels_len, padding='pre'))

# create predictors and labels
x_train = input_sequences
y_train = tf.keras.utils.to_categorical(labels, num_classes=16)

Both datasets originally have shape (1098360, 14), but after calling tf.keras.utils.to_categorical() the shape of y_train becomes (1098360, 14, 16).
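For reference, the shape change can be reproduced on toy data. The sketch below is only an illustration under stated assumptions: pad_pre is a hypothetical helper standing in for pad_sequences(padding='pre'), np.eye(16)[...] stands in for to_categorical, and the token and label values are made up.

```python
import numpy as np

def pad_pre(seqs, maxlen):
    """Left-pad integer sequences with 0 (hypothetical stand-in for
    pad_sequences(padding='pre'); assumes no sequence is longer than maxlen)."""
    out = np.zeros((len(seqs), maxlen), dtype=int)
    for row, seq in enumerate(seqs):
        out[row, maxlen - len(seq):] = seq
    return out

# Made-up tokenized line and per-token class ids (assumed to lie in 1..15).
token_list = [4, 9, 2, 7]
label_list = [3, 1, 1, 5]

# Same prefix scheme as the loop above: first 2 tokens, first 3, and so on.
inputs = [token_list[:i + 1] for i in range(1, len(token_list))]
labels = [label_list[:i + 1] for i in range(1, len(label_list))]

x_toy = pad_pre(inputs, maxlen=4)
padded_labels = pad_pre(labels, maxlen=4)
print(x_toy.shape, padded_labels.shape)  # (3, 4) (3, 4)

# One-hot encoding adds a trailing axis of size num_classes (16 here).
y_toy = np.eye(16, dtype=np.float32)[padded_labels]
print(y_toy.shape)  # (3, 4, 16)
```

The padded positions are encoded as class 0, so the extra axis appears for every position, including the padding.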

My model has two Bidirectional LSTM layers:

model.add(Embedding(total_words, 100, input_length=max_sequence_len))
model.add(Bidirectional(LSTM(256, return_sequences=True)))
model.add(Bidirectional(LSTM(128)))
model.add(Dense(16, activation='softmax'))
adam = Adam(learning_rate=0.01)  # learning_rate replaces the older lr argument
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=20, batch_size=size_batch, verbose=1, callbacks=[tensorboard])

Training then fails with this error: A target array with shape (1098360, 14, 16) was passed for an output of shape (None, 16) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.

How can I tell my model that the output shape is (None, 14, 16)?

Solution

Before the call to to_categorical, y_train already seems to be a vector, so you may not need to_categorical at all. However, if that vector contains more than one class per sample, as in multilabel classification, call to_categorical and then sum over axis 1 to collapse the 14 positions. The target then has shape (1098360, 16), which matches the model's (None, 16) output.
The end result would look like this:
y_train = to_categorical(y_train, num_classes=16).sum(axis=1)
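To make the effect of the sum concrete, here is a minimal NumPy sketch. It is an illustration only: the toy labels are made up, and np.eye(16)[labels] is used as a stand-in for to_categorical so the snippet runs without TensorFlow.

```python
import numpy as np

NUM_CLASSES = 16

# Made-up labels: 3 samples, 14 positions each, pre-padded with 0.
labels = np.zeros((3, 14), dtype=int)
labels[0, -3:] = [2, 5, 5]   # class 5 appears twice in this row
labels[1, -2:] = [7, 1]
labels[2, -1:] = [15]

# Stand-in for to_categorical(labels, num_classes=16).
one_hot = np.eye(NUM_CLASSES, dtype=np.float32)[labels]
print(one_hot.shape)  # (3, 14, 16)

# Summing over the sequence axis gives one row of 16 values per sample.
y_train = one_hot.sum(axis=1)
print(y_train.shape)  # (3, 16)

# The sum yields per-class counts, not strict 0/1 flags.
print(y_train[0, 5])  # 2.0
print(y_train[0, 0])  # 11.0, because the 11 padding zeros count as class 0

# Optional: clip to obtain a 0/1 multi-hot target instead of counts.
y_multi_hot = np.clip(y_train, 0, 1)
```

Because the result holds counts, a class that repeats within a row, including the padding value 0, ends up greater than 1. Whether to clip or to drop the padding class first depends on what the target is meant to represent.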