Tags: tensorflow, nlp, deep-learning, keras, keras-layer

Why does a Keras Sequential model give a different result than a functional Model?


I tried a simple LSTM model in Keras for sentiment analysis on the IMDB dataset, once with the Sequential API and once with the functional Model API, and the latter gives a worse result. Here is the Sequential version:

model = Sequential()
model.add(Embedding(top_words, embedding_vector_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

It reaches an accuracy of around 0.6 in the first epoch. The other version, which uses Model, is:

_input = Input(shape=[max_review_length], dtype='int32')
embedded = Embedding(
        input_dim=top_words,
        output_dim=embedding_size,
        input_length=max_review_length,
        trainable=False,
        mask_zero=False
    )(_input)
lstm = LSTM(100, return_sequences=True)(embedded)
probabilities = Dense(2, activation='softmax')(lstm)
model = Model(_input, probabilities)
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

It gives an accuracy of 0.5 in the first epoch, and that never changes afterwards.

Is there a reason for that, or am I doing something wrong? Thanks in advance.


Solution

  • I see two main differences between your two models:

    1. The embedding layer of the second model is set to trainable=False, so its weights stay at their random initial values and only the LSTM and Dense layers are trained. The second model therefore has far fewer parameters to optimize than the first.
    2. The LSTM in the second model uses return_sequences=True, so it returns the output at every timestep rather than only the last one. The Dense layer then produces an output of shape (batch, max_review_length, 2) instead of (batch, 2), so the two models are not doing the same thing and cannot be compared directly.
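
To make the two differences concrete, here is a rough dependency-free sketch that counts trainable parameters and works out the output shape for an Embedding -> LSTM -> Dense stack. The question does not give the hyperparameter values, so the vocabulary size, embedding size, and sequence length below are assumptions chosen only for illustration, and describe is a hypothetical helper, not part of Keras. The formulas are the standard parameter counts for these layers with biases enabled.

```python
# Assumed hyperparameters (not stated in the question); adjust to your setup.
TOP_WORDS = 5000          # vocabulary size
EMBEDDING_DIM = 32        # embedding vector length
MAX_REVIEW_LENGTH = 500   # padded review length
LSTM_UNITS = 100
NUM_CLASSES = 2


def embedding_params(vocab_size, dim):
    # One vector of length `dim` per vocabulary entry.
    return vocab_size * dim


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # Weight matrix plus bias vector.
    return input_dim * units + units


def describe(trainable_embedding, return_sequences):
    """Hypothetical helper: (trainable parameter count, output shape)."""
    trainable = lstm_params(EMBEDDING_DIM, LSTM_UNITS)
    trainable += dense_params(LSTM_UNITS, NUM_CLASSES)
    if trainable_embedding:
        trainable += embedding_params(TOP_WORDS, EMBEDDING_DIM)
    if return_sequences:
        # Dense is applied at every timestep, so the time axis is kept.
        output_shape = (None, MAX_REVIEW_LENGTH, NUM_CLASSES)
    else:
        output_shape = (None, NUM_CLASSES)
    return trainable, output_shape


# Settings matching the Sequential model in the question.
sequential_like = describe(trainable_embedding=True, return_sequences=False)
# Settings matching the functional model in the question.
functional_like = describe(trainable_embedding=False, return_sequences=True)

print(sequential_like)
print(functional_like)
```

Under these assumed sizes, the second configuration trains only the LSTM and Dense weights and emits one prediction per timestep. Leaving trainable at its default of True and dropping return_sequences=True in the functional version should make it match the Sequential one.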