Tags: python, deep-learning, lstm

Understand the summary of an LSTM model


I have the following LSTM model. Can somebody help me understand its summary?

a) How are the Param # values calculated?

b) Why do the output shapes show None?

c) Why is the Param # next to the dropout layer 0?

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.regularizers import l2

model = Sequential()
model.add(LSTM(64, return_sequences=True, recurrent_regularizer=l2(0.0015),
               input_shape=(timesteps, input_dim)))
model.add(Dropout(0.5))
model.add(LSTM(64, recurrent_regularizer=l2(0.0015)))


model.add(Dense(64, activation='relu'))
model.add(Dense(64, activation='relu'))

model.add(Dense(n_classes, activation='softmax'))
model.summary()

The values of timesteps, input_dim, and X_train are:

timesteps=100
input_dim= 6
X_train=1120

The summary is:

    Layer (type)                 Output Shape              Param #
    =================================================================
    lstm_1 (LSTM)                (None, 100, 64)           18176
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 100, 64)           0
    _________________________________________________________________
    lstm_2 (LSTM)                (None, 64)                33024
    _________________________________________________________________
    dense_1 (Dense)              (None, 64)                4160
    _________________________________________________________________
    dense_2 (Dense)              (None, 64)                4160
    _________________________________________________________________
    dense_3 (Dense)              (None, 6)                 390
    =================================================================
    Total params: 59,910
    Trainable params: 59,910
    Non-trainable params: 0

Solution

  • Part of your question is answered here.

    https://datascience.stackexchange.com/questions/10615/number-of-parameters-in-an-lstm-model

    Simply put, an LSTM layer has four gates (input, forget, cell, and output), and each gate has an input weight matrix, a recurrent weight matrix, and a bias vector. That gives 4 × (input_dim × units + units² + units) parameters per LSTM layer. For lstm_1 that is 4 × (6 × 64 + 64 × 64 + 64) = 18,176; for lstm_2, whose input is the 64-dimensional output of lstm_1, it is 4 × (64 × 64 + 64 × 64 + 64) = 33,024. A Dense layer has input_dim × units + units parameters, e.g. 64 × 64 + 64 = 4,160 for dense_1. As for the None in the output shapes: that is the batch size, which is not fixed when the model is defined, so Keras leaves it unspecified.
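
    As a sanity check, here is a minimal sketch that recomputes the counts from those formulas (lstm_params and dense_params are just illustrative helper names, not Keras functions):

        def lstm_params(input_dim, units):
            # 4 gates, each with input weights, recurrent weights, and a bias
            return 4 * (input_dim * units + units * units + units)

        def dense_params(input_dim, units):
            # one weight per input-output pair plus one bias per unit
            return input_dim * units + units

        print(lstm_params(6, 64))    # 18176 -> lstm_1
        print(lstm_params(64, 64))   # 33024 -> lstm_2
        print(dense_params(64, 64))  # 4160  -> dense_1 and dense_2
        print(dense_params(64, 6))   # 390   -> dense_3 (n_classes = 6)
        # 18176 + 0 + 33024 + 4160 + 4160 + 390 = 59910, matching the summary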

    Dropout layers have no parameters because there is nothing to learn in a dropout layer. During training it randomly sets a fraction of its inputs to zero (here 50%, since you chose a rate of 0.5) and rescales the rest; at test time it simply passes its inputs through. Since it holds no weights, its Param # is 0.
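
    You can see this directly with a small sketch (assuming TensorFlow's Keras; with standalone Keras the imports differ slightly):

        import numpy as np
        import tensorflow as tf

        layer = tf.keras.layers.Dropout(0.5)
        x = np.ones((1, 8), dtype="float32")

        # Training mode: roughly half the inputs are zeroed,
        # the rest are scaled by 1 / (1 - 0.5)
        print(layer(x, training=True).numpy())

        # Inference mode: the layer is the identity
        print(layer(x, training=False).numpy())

        # And it holds no trainable weights at all
        print(layer.count_params())  # 0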