After creating a pre-trained embedding layer with gensim, my val_accuracy has dropped to 45% on 4,600 records:
from tensorflow.keras import models
from tensorflow.keras.layers import Embedding, GRU, LSTM, Dropout, Dense

model = models.Sequential()
# embedding_model is the (MAX_NB_WORDS, EMBEDDING_DIM) weight matrix built
# from the gensim vectors; it is frozen so only the layers above it train.
model.add(Embedding(input_dim=MAX_NB_WORDS, output_dim=EMBEDDING_DIM,
                    weights=[embedding_model], trainable=False,
                    input_length=seq_len, mask_zero=True))
model.add(GRU(units=150, return_sequences=True))
model.add(Dropout(0.4))
model.add(LSTM(units=200, dropout=0.4))
model.add(Dense(100, activation='relu'))
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
              metrics=['accuracy'])
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_4 (Embedding)      (None, 50, 100)           2746300
_________________________________________________________________
gru_4 (GRU)                  (None, 50, 150)           112950
_________________________________________________________________
dropout_4 (Dropout)          (None, 50, 150)           0
_________________________________________________________________
lstm_4 (LSTM)                (None, 200)               280800
_________________________________________________________________
dense_7 (Dense)              (None, 100)               20100
_________________________________________________________________
dense_8 (Dense)              (None, 4)                 404
=================================================================
Total params: 3,160,554
Trainable params: 414,254
Non-trainable params: 2,746,300
_________________________________________________________________
Full code is at https://colab.research.google.com/drive/13N94kBKkHIX2TR5B_lETyuH1QTC5VuRf?usp=sharing
Any help would be greatly appreciated. I am new to deep learning and have tried almost everything I know, but now I am completely stuck.
The problem is with your input. You've padded your input sequences with zeros but have not told the model about it, so the model doesn't ignore the zeros, which is why it's not learning at all. To resolve this, change your embedding layer as follows:
model.add(layers.Embedding(input_dim=vocab_size + 1,
                           output_dim=embedding_dim,
                           mask_zero=True))
This will enable your model to ignore the zero padding and learn. Training with this change, I got a training accuracy of 100% in just 6 epochs, though validation accuracy wasn't great (around 54%), which is expected since your training data contains only 32 examples. More about the embedding layer: https://keras.io/api/layers/core_layers/embedding/
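If you want to verify that masking is actually active, here is a minimal sketch (assuming TF 2.x; the token ids and sizes are made up for illustration) that prints the boolean mask the Embedding layer computes for padded input:

import numpy as np
import tensorflow as tf

# Two padded sequences; the trailing zeros are padding, not real tokens.
padded = np.array([[5, 12, 7, 0, 0],
                   [9,  3, 0, 0, 0]])

emb = tf.keras.layers.Embedding(input_dim=100, output_dim=8, mask_zero=True)

# With mask_zero=True the layer computes a boolean mask that downstream
# GRU/LSTM layers use to skip the padded timesteps.
print(emb.compute_mask(padded))
# [[ True  True  True False False]
#  [ True  True False False False]]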
Since your dataset is small, the model overfits the training data quite easily, which gives lower validation accuracy. To mitigate this to some extent, you can use pre-trained word embeddings such as word2vec or GloVe instead of training your own embedding layer, and try text data augmentation methods such as generating artificial examples from templates or replacing words in the training data with their synonyms (see the sketches below). You can also experiment with different layer types (e.g., replacing the GRU with another LSTM), but in my opinion that is unlikely to help much here and should be considered only after trying pre-trained embeddings and data augmentation.
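As a concrete starting point for the pre-trained-embeddings suggestion, here is a hedged sketch of building a frozen Embedding layer from gensim vectors. The names word_index (from a Keras Tokenizer) and kv (a gensim KeyedVectors object, e.g. loaded via gensim.downloader) are assumptions for illustration, not taken from the original post:

import numpy as np
from tensorflow.keras.layers import Embedding

def build_embedding_layer(word_index, kv, vocab_size):
    # word_index: {word: integer id} from a Keras Tokenizer (assumed).
    # kv: gensim KeyedVectors holding the pre-trained vectors (assumed).
    # Row 0 stays all zeros for the padding token, as mask_zero=True expects.
    matrix = np.zeros((vocab_size + 1, kv.vector_size))
    for word, i in word_index.items():
        if i <= vocab_size and word in kv:
            matrix[i] = kv[word]
    return Embedding(input_dim=vocab_size + 1,
                     output_dim=kv.vector_size,
                     weights=[matrix],
                     trainable=False,
                     mask_zero=True)

And a similarly hedged sketch of the synonym-replacement idea using WordNet (requires nltk.download('wordnet'); treat it as illustrative, not a production augmentation pipeline):

import random
from nltk.corpus import wordnet

def synonym_augment(tokens, p=0.2):
    # Replace roughly a fraction p of tokens with a random WordNet synonym.
    out = []
    for tok in tokens:
        lemmas = {l.name().replace('_', ' ')
                  for s in wordnet.synsets(tok) for l in s.lemmas()}
        lemmas.discard(tok)
        if lemmas and random.random() < p:
            out.append(random.choice(sorted(lemmas)))
        else:
            out.append(tok)
    return out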