Tags: python-3.x, keras, nlp, lstm, faster-rcnn

How do I add a CNN layer before Bi-LSTM layer


I want to add a CNN layer with max-pooling before a Bi-LSTM layer for a sentiment classification task, but I am getting an error.

Here is the code I am using.

model = Sequential()
model.add(Embedding(max_words, 30, input_length=max_len))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Conv1D(32, kernel_size=3, activation='relu'))
model.add(GlobalMaxPooling1D())
model.add(Flatten())
model.add(Bidirectional(LSTM(32, return_sequences=True)))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.8))
model.add(Dense(1, activation='sigmoid'))
model.summary()

This is the error I am getting:

ValueError                                Traceback (most recent call last)
<ipython-input-64-49cde447597a> in <module>()
      6 model.add(Conv1D(32, kernel_size=3, activation='relu'))
      7 model.add(GlobalMaxPooling1D())
----> 8 model.add(Flatten())
      9 model.add(Bidirectional(LSTM(32, return_sequences=True)))
     10 model.add(BatchNormalization())

2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py in assert_input_compatibility(self, inputs)
    356                                      self.name + ': expected min_ndim=' +
    357                                      str(spec.min_ndim) + ', found ndim=' +
--> 358                                      str(K.ndim(x)))
    359             # Check dtype.
    360             if spec.dtype is not None:

ValueError: Input 0 is incompatible with layer flatten_3: expected min_ndim=3, found ndim=2
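The error occurs because GlobalMaxPooling1D collapses the time dimension: it turns the 3D tensor (batch, steps, channels) produced by Conv1D into a 2D tensor (batch, channels). Flatten expects at least 3D input, hence the ValueError, and even without Flatten the LSTM would have no sequence dimension left to process. A minimal NumPy sketch of the shape change (the array sizes are illustrative, matching a Conv1D with 32 filters over 48 steps):

```python
import numpy as np

# Simulated Conv1D output: (batch, steps, channels)
conv_out = np.random.rand(4, 48, 32)

# GlobalMaxPooling1D takes the max over the time axis (steps),
# collapsing the tensor from 3D to 2D
pooled = conv_out.max(axis=1)

print(conv_out.shape)  # (4, 48, 32)
print(pooled.shape)    # (4, 32)
```

This is why the traceback reports `expected min_ndim=3, found ndim=2`: by the time the tensor reaches Flatten, the sequence axis is already gone.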

Solution

  • My suggestion: remove the Flatten and GlobalMaxPooling1D layers so the embedding output stays in 3D format, which is what the LSTM expects as input. I also set return_sequences=False because this is a sentiment classifier, so the output should be 2D.

    from keras.models import Sequential
    from keras.layers import (Embedding, SpatialDropout1D, Conv1D, Bidirectional,
                              LSTM, BatchNormalization, Activation, Dropout, Dense)
    
    max_words = 111
    max_len = 50
    
    model = Sequential()
    model.add(Embedding(max_words, 30, input_length=max_len))
    model.add(SpatialDropout1D(0.5))
    model.add(Conv1D(32, kernel_size=3, activation='relu'))
    model.add(Bidirectional(LSTM(32, return_sequences=False)))
    model.add(BatchNormalization())
    model.add(Activation('tanh'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))
    model.summary()
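To sanity-check the fixed architecture end to end, the model can be compiled and fit on random integer token ids with binary labels. This is a sketch assuming the `tf.keras` backend; the dummy data below is purely illustrative, and an explicit `Input` layer is used in place of `input_length` for compatibility with newer Keras versions:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Embedding, SpatialDropout1D, Conv1D,
                                     Bidirectional, LSTM, BatchNormalization,
                                     Activation, Dropout, Dense)

max_words, max_len = 111, 50

model = Sequential([
    Input(shape=(max_len,)),                 # integer token ids per sequence
    Embedding(max_words, 30),                # -> (None, 50, 30), 3D
    SpatialDropout1D(0.5),
    Conv1D(32, kernel_size=3, activation='relu'),  # -> (None, 48, 32), still 3D
    Bidirectional(LSTM(32, return_sequences=False)),  # -> (None, 64), 2D
    BatchNormalization(),
    Activation('tanh'),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),          # -> (None, 1)
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Dummy data: 8 sequences of token ids, with binary sentiment labels
X = np.random.randint(0, max_words, size=(8, max_len))
y = np.random.randint(0, 2, size=(8, 1))
model.fit(X, y, epochs=1, batch_size=4, verbose=0)
print(model.output_shape)  # (None, 1)
```

Because `return_sequences=False`, the Bi-LSTM emits a single 2D vector per sample, so the sigmoid Dense head receives exactly the shape it needs for binary classification.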