Tags: deep-learning, lstm, recurrent-neural-network, autoencoder

Error received while building the autoencoder


I am trying to build an autoencoder for my term project, using a CNN as the encoder and an LSTM as the decoder. However, when I display the summary of the model, I receive the following error:

ValueError: Input 0 is incompatible with layer lstm_10: expected ndim=3, found ndim=2

x.shape = (45406, 100, 100)
y.shape = (45406,)

I already tried changing the shape of the input for the LSTM, but it didn't work.

# Assuming the standard Keras imports (not shown in the original post):
from keras.models import Sequential, Model
from keras.layers import (Lambda, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, Flatten, LSTM, Dropout, Dense)

def keras_model(image_x, image_y):

    model = Sequential()
    model.add(Lambda(lambda x: x / 127.5 - 1., input_shape=(image_x, image_y, 1)))

    last = model.output
    x = Conv2D(3, (3, 3), padding='same')(last)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), padding='valid')(x)

    encoded = Flatten()(x)  # 2-D output: (batch, 50*50*3)
    x = LSTM(8, return_sequences=True, input_shape=(100, 100))(encoded)  # fails: LSTM expects 3-D input
    decoded = LSTM(64, return_sequences=True)(x)

    x = Dropout(0.5)(decoded)
    x = Dense(400, activation='relu')(x)
    x = Dense(25, activation='relu')(x)
    final = Dense(1, activation='relu')(x)

    autoencoder = Model(model.input, final)

    autoencoder.compile(optimizer="Adam", loss="mse")
    autoencoder.summary()

model = keras_model(100, 100)

Solution

  • Since you are using an LSTM, you need a time dimension, so your input shape should be (time, image_x, image_y, nb_image_channels).

    I would suggest getting a more in-depth understanding of autoencoders, LSTMs and 2D convolution, since all of these play together here. These are helpful intros: https://machinelearningmastery.com/lstm-autoencoders/ and https://blog.keras.io/building-autoencoders-in-keras.html.

    Also have a look at this example, where someone implemented an LSTM with Conv2D: How to reshape 3 channel dataset for input to neural network. The TimeDistributed layer comes in useful here (see the sketch after the code below).

    However, just to get your error fixed, you can add a Reshape() layer to fake the extra dimension:

    # Assuming the standard Keras imports (not shown in the original post):
    from keras.models import Sequential, Model
    from keras.layers import (Lambda, Conv2D, BatchNormalization, Activation,
                              MaxPooling2D, Flatten, Reshape, LSTM, Dropout, Dense)

    def keras_model(image_x, image_y):

        model = Sequential()
        model.add(Lambda(lambda x: x / 127.5 - 1., input_shape=(image_x, image_y, 1)))

        last = model.output
        x = Conv2D(3, (3, 3), padding='same')(last)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = MaxPooling2D((2, 2), padding='valid')(x)

        encoded = Flatten()(x)
        # (50, 50, 3) is the output shape of the max pooling layer (see model summary),
        # so treat the flattened vector as 50*50*3 timesteps of 1 feature each
        encoded = Reshape((50 * 50 * 3, 1))(encoded)
        x = LSTM(8, return_sequences=True)(encoded)  # input_shape can be removed
        decoded = LSTM(64, return_sequences=True)(x)

        x = Dropout(0.5)(decoded)
        x = Dense(400, activation='relu')(x)
        x = Dense(25, activation='relu')(x)
        final = Dense(1, activation='relu')(x)

        autoencoder = Model(model.input, final)

        autoencoder.compile(optimizer="Adam", loss="mse")
        autoencoder.summary()
        return autoencoder  # return the model so the caller actually gets it

    model = keras_model(100, 100)
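
    Note that to feed the original x of shape (45406, 100, 100) into this model, a channel axis still has to be added, e.g. x = np.expand_dims(x, -1) to get (45406, 100, 100, 1).

    If you later want a real time dimension instead of a faked one, a rough sketch of the TimeDistributed approach mentioned above could look like the following. This is not code from the question: the sequence length of 10, the layer sizes and the plain Keras imports are illustrative assumptions.

    from keras.models import Model
    from keras.layers import (Input, TimeDistributed, Conv2D, BatchNormalization,
                              Activation, MaxPooling2D, Flatten, LSTM, Dense)

    def keras_sequence_model(time_steps, image_x, image_y):
        # Illustrative sketch: the input is now a sequence of frames,
        # shaped (time, height, width, channels)
        inp = Input(shape=(time_steps, image_x, image_y, 1))

        # TimeDistributed applies the same Conv2D encoder to every frame
        x = TimeDistributed(Conv2D(3, (3, 3), padding='same'))(inp)
        x = TimeDistributed(BatchNormalization())(x)
        x = TimeDistributed(Activation('relu'))(x)
        x = TimeDistributed(MaxPooling2D((2, 2), padding='valid'))(x)
        x = TimeDistributed(Flatten())(x)  # -> (batch, time_steps, 50*50*3)

        # The LSTM now receives a genuine 3-D tensor: (batch, time, features)
        x = LSTM(64)(x)
        final = Dense(1, activation='relu')(x)

        model = Model(inp, final)
        model.compile(optimizer="Adam", loss="mse")
        return model

    model = keras_sequence_model(10, 100, 100)
    model.summary()

    This also changes the required data layout: consecutive frames would first have to be grouped into windows, giving x a shape of (n_samples, time_steps, 100, 100, 1).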