Tags: keras, classification, lstm, conv-neural-network

Sequence to Sequence classification with CNN-LSTM model in keras


I'm working with 1000 samples. Each sample is associated with a person who has 70 different vital signs and health features measured at 168 different time steps. Then, for each time step, I should predict a binary label. So, the input and output shapes are:

Input.shape = (1000, 168, 70)
Output.shape = (1000, 168, 1) 

The goal is to build a CNN-LSTM model: a CNN to extract features, then an LSTM to capture the temporal information, and finally a dense layer for binary classification at each time step.

The following is the code I tried.

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential()
model.add(Conv1D(filters=16, kernel_size=5, strides=1, padding="same", input_shape=(168, 70), activation='relu'))
model.add(MaxPooling1D())
model.add(LSTM(64, return_sequences=True))
model.add(Dense(1, activation="sigmoid"))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train_3D, Y_train_3D, batch_size=32, epochs=500, validation_data=(X_val_3D, Y_val_3D))

I'm new to this kind of model, so I'm sure I'm doing something wrong that I cannot find. Here is the error:

ValueError: logits and labels must have the same shape ((None, 84, 1) vs (None, 168, 1))

Solution

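  • Since you are using return_sequences=True, the LSTM returns one output per time step, with shape (batch_size, 84, 64). The 84 comes from MaxPooling1D, not from Conv1D: with padding="same" and strides=1 the convolution keeps all 168 steps, and the default pool_size=2 then halves them to 84. The Dense layer with 1 unit only reduces the last dimension, so (batch_size, 84, 64) becomes (batch_size, 84, 1), which does not match the labels of shape (batch_size, 168, 1). Because you need a prediction for every time step, keep return_sequences=True and keep the time dimension at 168: either remove the MaxPooling1D layer or restore the length afterwards, for example with UpSampling1D(size=2). Setting return_sequences=False or flattening the output before the Dense layer would only be appropriate if you wanted a single label per sample.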
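
To make the shape reasoning concrete, here is a minimal sketch of two ways to keep the output at 168 time steps so that it matches labels of shape (None, 168, 1). MaxPooling1D with its default pool_size=2 halves the 168 steps to 84, so the first variant drops the pooling layer and the second restores the length with UpSampling1D. The function names build_without_pooling and build_with_upsampling are hypothetical helpers introduced for illustration, and importing from the keras package is an assumption (with TensorFlow-bundled Keras the same layers live under tensorflow.keras).

```python
from keras.models import Sequential
from keras.layers import Input, Conv1D, MaxPooling1D, UpSampling1D, LSTM, Dense

TIME_STEPS, N_FEATURES = 168, 70  # shapes taken from the question


def build_without_pooling():
    """Variant 1 (hypothetical helper): no pooling, so all 168 steps survive."""
    return Sequential([
        Input(shape=(TIME_STEPS, N_FEATURES)),
        # padding="same" with strides=1 keeps the time dimension at 168
        Conv1D(filters=16, kernel_size=5, strides=1, padding="same", activation="relu"),
        # return_sequences=True yields one 64-dim vector per time step
        LSTM(64, return_sequences=True),
        # Dense acts on the last axis only: (None, 168, 64) -> (None, 168, 1)
        Dense(1, activation="sigmoid"),
    ])


def build_with_upsampling():
    """Variant 2 (hypothetical helper): pool to 84 steps, then upsample back to 168."""
    return Sequential([
        Input(shape=(TIME_STEPS, N_FEATURES)),
        Conv1D(filters=16, kernel_size=5, strides=1, padding="same", activation="relu"),
        MaxPooling1D(pool_size=2),        # 168 -> 84
        LSTM(64, return_sequences=True),  # (None, 84, 64)
        UpSampling1D(size=2),             # repeats each step: 84 -> 168
        Dense(1, activation="sigmoid"),
    ])


if __name__ == "__main__":
    for build in (build_without_pooling, build_with_upsampling):
        model = build()
        # Both variants should report an output shape of (None, 168, 1)
        print(build.__name__, model.output_shape)
```

Either model can then be compiled and fitted exactly as in the question. The upsampling variant lets the LSTM run over a shorter sequence, but UpSampling1D simply repeats each step, so neighbouring pairs of time steps receive the same prediction. If per-step resolution matters, the variant without pooling is probably the safer starting point.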