Tags: tensorflow, machine-learning, deep-learning, lstm, hyperparameters

LSTM Arguments modifications


I'm working on LSTM code and trying to improve my model's accuracy. I've tried changing the layer arguments*, the number of epochs, and the batch size, all in vain. I'm probably doing something wrong. Any help? Please share any tutorial or guide that could be helpful. Thank you.

*LSTM arguments

tf.keras.layers.LSTM(
    units, activation='tanh', recurrent_activation='sigmoid', use_bias=True,
    kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal',
    bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None,
    recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None,
    kernel_constraint=None, recurrent_constraint=None, bias_constraint=None,
    dropout=0.0, recurrent_dropout=0.0, implementation=2, return_sequences=False,
    return_state=False, go_backwards=False, stateful=False, time_major=False,
    unroll=False, **kwargs
) 

Solution

  • Recurrent neural networks can be hard to understand and work with at first. However, they are not as difficult as they seem.

    To understand recurrent neural networks and LSTMs from scratch, I think the best resource is the Colah blog.

    You can also see this article, which summarizes the concepts of RNNs.

    This tutorial on the Keras blog may be useful for implementing an RNN.

    Finally, to understand LSTM layers, think of them as simple Dense layers, with units as the size of the layer.
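To make the Dense analogy concrete, here is a minimal sketch comparing output shapes. The batch size, sequence length, feature count, and `units=16` are arbitrary values I picked for illustration, not anything from the question.

```python
import tensorflow as tf

# Dummy batch (sizes are assumptions): 8 sequences, 10 timesteps, 5 features each.
x = tf.random.uniform((8, 10, 5))

dense = tf.keras.layers.Dense(16)  # 16 units
lstm = tf.keras.layers.LSTM(16)    # 16 units

# Dense maps one feature vector per sample to 16 values;
# here it is fed only the last timestep of each sequence.
dense_out = dense(x[:, -1, :])

# The LSTM reads the whole sequence but, by default, also returns
# 16 values per sample (its output at the last timestep).
lstm_out = lstm(x)

print(dense_out.shape)  # (8, 16)
print(lstm_out.shape)   # (8, 16)
```

In both cases `units` sets the size of the output vector. The difference is that the LSTM builds that vector by stepping through all 10 timesteps.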

    What makes these layers special is how they work internally, and this is where the other arguments come in. Here I will just list the ones that I have used.

    units: Size of the layer (the dimensionality of its output)
    activation: Activation function to apply on the output of the layer
    use_bias: Boolean, whether the layer uses a bias vector
    return_sequences: Boolean. Set it to False for a many-to-one RNN (only the last timestep's output is returned) and to True for a many-to-many RNN (the output of every timestep is returned)
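The effect of `return_sequences` and `use_bias` is easiest to see from shapes and parameter counts. The following is only a sketch; the input sizes and `units=16` are assumed values for illustration.

```python
import tensorflow as tf

# Dummy batch (sizes are assumptions): 8 sequences, 10 timesteps, 5 features each.
x = tf.random.uniform((8, 10, 5))

# Many-to-one: only the output at the last timestep is returned.
last_only = tf.keras.layers.LSTM(16, return_sequences=False)(x)

# Many-to-many: one output per timestep is returned.
all_steps = tf.keras.layers.LSTM(16, return_sequences=True)(x)

print(last_only.shape)  # (8, 16)
print(all_steps.shape)  # (8, 10, 16)

# use_bias adds one bias value per unit for each of the 4 LSTM gates.
with_bias = tf.keras.layers.LSTM(16, use_bias=True)
no_bias = tf.keras.layers.LSTM(16, use_bias=False)
with_bias(x)  # calling a layer once builds its weights
no_bias(x)

print(with_bias.count_params())  # 4 * 16 * (5 + 16 + 1) = 1408
print(no_bias.count_params())    # 4 * 16 * (5 + 16)     = 1344
```

If you stack several LSTM layers, every layer except the last one needs `return_sequences=True`, because the next LSTM expects a sequence as input.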
    

    EDIT: This is the code of a convolutional recurrent neural network that I built for image classification. I hope it's what you are looking for.

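    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import (Input, Reshape, Conv1D, MaxPooling1D, LSTM,
                                         BatchNormalization, Flatten, Dense)

    # Placeholder values (assumption); use the size of your own images.
    IMG_HEIGHT, IMG_WIDTH = 64, 64

    model = Sequential()
    model.add(Input(shape=(IMG_HEIGHT, IMG_WIDTH, 3)))
    # Merge width and channels so that each image row becomes one timestep.
    model.add(Reshape(target_shape=(IMG_HEIGHT, IMG_WIDTH * 3)))
    model.add(Conv1D(filters=64, kernel_size=3, padding="same", activation='relu',
                     data_format='channels_last'))

    model.add(Conv1D(filters=64, kernel_size=3, padding="same", activation='relu'))
    model.add(MaxPooling1D(pool_size=3))

    model.add(Conv1D(filters=128, kernel_size=3, padding="same", activation='relu'))
    model.add(Conv1D(filters=128, kernel_size=3, padding="same", activation='relu'))
    # return_sequences defaults to False, so the LSTM outputs one 64-value vector per image.
    model.add(LSTM(64, activation='relu'))
    model.add(BatchNormalization())
    model.add(Flatten())
    model.add(Dense(4, activation='softmax'))  # 4 output classes
    # The Input layer already fixes the input shape, so no explicit model.build() call is needed.
    model.summary()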
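Since the question mentions epochs and batch size, here is a rough sketch of where those two values go when training. The tiny model is a stand-in I made up for illustration (it is not the network above), and the data is random, so the resulting loss and accuracy mean nothing. The optimizer, the loss, and all the sizes are assumptions too.

```python
import numpy as np
import tensorflow as tf

# Random stand-in data (assumption): 64 sequences, 10 timesteps, 5 features, 4 classes.
x = np.random.rand(64, 10, 5).astype("float32")
y = np.random.randint(0, 4, size=(64,))

# Small many-to-one model: one LSTM layer followed by a softmax classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 5)),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer class labels
              metrics=["accuracy"])

# epochs: number of full passes over the data.
# batch_size: number of samples per gradient update.
history = model.fit(x, y, epochs=3, batch_size=16, verbose=0)

print(len(history.history["loss"]))  # 3, one loss value per epoch

# Class probabilities for two samples, shape (2, 4).
probs = model.predict(x[:2], verbose=0)
print(probs.shape)
```

When tuning, the per-epoch values in `history.history` are useful for checking whether the loss is still going down before you change epochs or batch size.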
    

    I hope this helps.