
Passing the output of a CNN to a BiLSTM


I am working on a project in which I have to pass the output of a CNN to a bidirectional LSTM. I created the model as below, but it is throwing an 'incompatible' error. Please let me know where I am going wrong and how to fix it.


    model = Sequential()
    model.add(Conv2D(filters = 16, kernel_size = 3,input_shape = (32,32,1)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2),strides=1, padding='valid'))
    model.add(Activation('relu'))
    
    model.add(Conv2D(filters = 32, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Activation('relu'))
    
    model.add(Dropout(0.25))
    model.add(Conv2D(filters = 48, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Activation('relu'))
    
    model.add(Dropout(0.25))
    model.add(Conv2D(filters = 64, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(Dropout(0.25))
    model.add(Conv2D(filters = 80, kernel_size=3))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(Bidirectional(LSTM(150, return_sequences=True)))
    model.add(Dropout(0.3))
    model.add(Bidirectional(LSTM(96)))
    model.add(Dense(total_words/2, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
    model.add(Dense(total_words, activation='softmax'))
    
    model.summary()

The error returned is:


    ValueError                                Traceback (most recent call last)
    <ipython-input-24-261befed7006> in <module>()
         27 model.add(Activation('relu'))
         28 
    ---> 29 model.add(Bidirectional(LSTM(150, return_sequences=True)))
         30 model.add(Dropout(0.3))
         31 model.add(Bidirectional(LSTM(96)))
    
    5 frames
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
        178                          'expected ndim=' + str(spec.ndim) + ', found ndim=' +
        179                          str(ndim) + '. Full shape received: ' +
    --> 180                          str(x.shape.as_list()))
        181     if spec.max_ndim is not None:
        182       ndim = x.shape.ndims
    
    ValueError: Input 0 of layer bidirectional is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 1, 80]
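The 4D shape in the error can be verified by hand: with `'valid'` padding, each `Conv2D` with kernel size 3 shrinks every spatial dimension by 2, and each `MaxPooling2D` shrinks it further, so the 32x32 input collapses to 1x1 by the last conv layer. A rough sketch of the arithmetic (the helper names here are mine, not part of the question's code):

```python
# Output-size formulas for 'valid' padding:
# Conv2D, kernel k:            n -> n - k + 1
# MaxPooling2D, pool p, stride s: n -> (n - p) // s + 1
def conv(n, k=3):
    return n - k + 1

def pool(n, p=2, stride=2):
    return (n - p) // stride + 1

n = 32
n = pool(conv(n), stride=1)  # Conv2D(16) + MaxPool stride 1 -> 29
n = pool(conv(n))            # Conv2D(32) + MaxPool          -> 13
n = pool(conv(n))            # Conv2D(48) + MaxPool          -> 5
n = conv(n)                  # Conv2D(64)                    -> 3
n = conv(n)                  # Conv2D(80)                    -> 1
print(n)  # 1, i.e. the output shape (None, 1, 1, 80) from the error
```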


Solution

  • The problem is the shape of the data passed to the LSTM, and it can be solved inside your network. The LSTM expects 3D input, while Conv2D produces 4D output. There are two possible reshapes you can adopt:

    1) reshape to (batch_size, H, W*channels);

    2) reshape to (batch_size, W, H*channels).

    Either way, you get 3D data to feed into your LSTM. Below is an example:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, Lambda, Reshape, Permute, LSTM, Dense

    def ReshapeLayer(x):
        
        shape = x.shape
        
        # possibility 1: (H, W*channels)
        reshape = Reshape((shape[1], shape[2]*shape[3]))(x)
        
        # possibility 2: (W, H*channels)
        # transpose = Permute((2,1,3))(x)
        # reshape = Reshape((shape[2], shape[1]*shape[3]))(transpose)
        
        return reshape
    
    model = Sequential()
    model.add(Conv2D(filters=16, kernel_size=3, input_shape=(32,32,3)))
    model.add(Lambda(ReshapeLayer))  # <============
    model.add(LSTM(16))
    model.add(Dense(units=2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.summary()
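The two reshape options can be sanity-checked with plain NumPy, without building a model. Assuming a 30x30x16 feature map (what `Conv2D(16, kernel_size=3)` produces from a 32x32x3 input), both collapse the 4D tensor into the 3D (batch, timesteps, features) layout the LSTM expects:

```python
import numpy as np

# A fake 4D conv output: (batch, H, W, channels)
x = np.random.rand(2, 30, 30, 16)

# Option 1: each row is a timestep -> (batch, H, W*channels)
opt1 = x.reshape(x.shape[0], x.shape[1], -1)

# Option 2: each column is a timestep -> (batch, W, H*channels)
# (transpose first, matching the Permute((2,1,3)) in the Keras version)
opt2 = x.transpose(0, 2, 1, 3).reshape(x.shape[0], x.shape[2], -1)

print(opt1.shape)  # (2, 30, 480)
print(opt2.shape)  # (2, 30, 480)
```

Note that for the original network in the question, the final feature map is 1x1x80, so after the reshape the LSTM would see a sequence of length 1; removing some of the pooling layers would leave a longer spatial dimension to treat as timesteps.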