python, machine-learning, keras, lstm, keras-layer

How can an LSTM have an output_dim different from the input_dim of the next layer?


I was looking at this code:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(LSTM(input_shape = (1,), input_dim=1, output_dim=6, return_sequences=True))
model.add(LSTM(input_shape = (1,), input_dim=1, output_dim=6, return_sequences=False))
model.add(Dense(1))
model.add(Activation('linear'))

How can the first layer have an output of dim=6 and then the next layer have an input_dim=1?

Edit

The code is wrong, and Keras simply tries its best: the model it actually builds does not match the code (see the model summary in the answer below).


Solution

  • This code is very confusing and should never be written like this.

    In a Sequential model, Keras respects the input_shape of the first layer only. Every subsequent layer is initialized from the output of the previous layer, so its own input_shape/input_dim specification is effectively ignored (see the source code in keras/models.py). In this case, the second LSTM receives input of shape (None, None, 6).

    The model summary thus looks like this:

    Layer (type)                 Output Shape              Param #   
    =================================================================
    lstm_1 (LSTM)                (None, None, 6)           192       
    _________________________________________________________________
    lstm_2 (LSTM)                (None, 6)                 312       
    =================================================================
    

    By the way, Keras emits warnings for this LSTM specification, because input_dim is deprecated:

    Update your LSTM call to the Keras 2 API: LSTM(input_shape=(None, 1), return_sequences=True, units=6)

    Update your LSTM call to the Keras 2 API: LSTM(input_shape=(None, 1), return_sequences=False, units=6)
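
    For reference, a minimal sketch of the same model rewritten against the Keras 2 API that those warnings suggest (the units=6 and input_shape=(None, 1) arguments come straight from the warning text). Only the first layer needs an input shape; the second LSTM infers its input, (None, None, 6), from the layer above:

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, Activation

    model = Sequential()
    # Only the first layer carries an input shape: (timesteps, features) = (None, 1)
    model.add(LSTM(units=6, input_shape=(None, 1), return_sequences=True))
    # Input shape is inferred as (None, None, 6) from the previous layer's output
    model.add(LSTM(units=6, return_sequences=False))
    model.add(Dense(1))
    model.add(Activation('linear'))

    model.summary()  # reproduces the 192- and 312-parameter LSTM rows shown above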