
How to apply Layer Normalisation in LSTMCell


I want to apply Layer Normalisation to a recurrent neural network while using tf.compat.v1.nn.rnn_cell.LSTMCell.

There is a LayerNormalization class, but how should I apply it to LSTMCell? I am using tf.compat.v1.nn.rnn_cell.LSTMCell because I want to use a projection layer. How should I achieve normalisation in this case?

import tensorflow as tf

class LM(tf.keras.Model):
  def __init__(self, hidden_size=2048, num_layers=2):
    super(LM, self).__init__()
    self.hidden_size = hidden_size
    self.num_layers = num_layers
    self.lstm_layers = []
    self.proj_dim = 640
    for i in range(self.num_layers):
      name1 = 'lm_lstm'+str(i)
      # LSTM cell whose hidden state is projected down to proj_dim units.
      self.cell = tf.compat.v1.nn.rnn_cell.LSTMCell(self.hidden_size, num_proj=self.proj_dim)
      self.lstm_layers.append(tf.keras.layers.RNN(self.cell, return_sequences=True, name=name1))

  def call(self, x):
    # Feed the output of each layer into the next one.
    for i in range(self.num_layers):
      output = self.lstm_layers[i](x)
      x = output
    state_h = ""  # placeholder: the final hidden state is not collected here
    return x, state_h

Solution

  • It depends on whether you want to apply the normalization at the cell level or at the layer level. I'm not sure which one is the correct way to do it; the paper doesn't specify. Here is an older implementation that you might use for inspiration.

    To normalize at cell level, you probably need to create a custom RNNCell and implement the normalization there.

    P.S. You might also be able to apply LayerNormalization to the output of the RNN, as shown below, but you'll need to think carefully about whether it has the desired effect, especially given the variable shapes inherent to sequence models.

    self.lstm_layers.append(tf.keras.layers.RNN(self.cell, return_sequences=True, name=name1))
    self.lstm_layers.append(tf.keras.layers.LayerNormalization())
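
To make the two options more concrete, here is a minimal sketch, not a definitive implementation. The class name LayerNormProjLSTMCell and the choice to normalize each gate and the cell state separately are assumptions made for illustration; other placements of the normalization are possible. The cell is written as a plain Keras layer with its own projection instead of wrapping tf.compat.v1.nn.rnn_cell.LSTMCell, and the sizes are kept small so it runs quickly.

```python
import tensorflow as tf


class LayerNormProjLSTMCell(tf.keras.layers.Layer):
    """Hypothetical LSTM cell with per-gate layer normalization and a projection."""

    def __init__(self, units, num_proj, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.num_proj = num_proj
        # State is [projected hidden state, cell state]; the output is the projection.
        self.state_size = [num_proj, units]
        self.output_size = num_proj
        # One dense map yields the pre-activations of all four gates.
        self.gates = tf.keras.layers.Dense(4 * units, use_bias=False)
        # Each gate and the cell state get their own gain and bias.
        self.gate_norms = [tf.keras.layers.LayerNormalization() for _ in range(4)]
        self.cell_norm = tf.keras.layers.LayerNormalization()
        self.proj = tf.keras.layers.Dense(num_proj, use_bias=False)

    def build(self, input_shape):
        # Create all sublayer weights up front, before the RNN loop runs.
        self.gates.build((None, input_shape[-1] + self.num_proj))
        for norm in self.gate_norms:
            norm.build((None, self.units))
        self.cell_norm.build((None, self.units))
        self.proj.build((None, self.units))
        self.built = True

    def call(self, inputs, states, training=None):
        h_prev, c_prev = states[0], states[1]
        z = self.gates(tf.concat([inputs, h_prev], axis=-1))
        parts = tf.split(z, 4, axis=-1)
        # Normalize the pre-activation of each gate over its feature axis.
        i, f, g, o = [norm(p) for norm, p in zip(self.gate_norms, parts)]
        c = tf.sigmoid(f) * c_prev + tf.sigmoid(i) * tf.tanh(g)
        h = tf.sigmoid(o) * tf.tanh(self.cell_norm(c))
        h_proj = self.proj(h)  # projection, analogous to num_proj
        return h_proj, [h_proj, c]


inputs = tf.random.normal([2, 5, 16])  # (batch, time, features)

# Option 1: normalization inside the cell, at every time step.
cell_level = tf.keras.layers.RNN(
    LayerNormProjLSTMCell(32, num_proj=8), return_sequences=True)
outputs = cell_level(inputs)  # shape (2, 5, 8)

# Option 2: normalization applied to the output of an ordinary RNN layer.
plain_rnn = tf.keras.layers.RNN(tf.keras.layers.LSTMCell(32), return_sequences=True)
layer_level = tf.keras.layers.LayerNormalization()  # default axis=-1
normalized = layer_level(plain_rnn(inputs))  # shape (2, 5, 32)
```

With the default axis=-1, LayerNormalization computes its statistics over the feature axis of each time step separately, so the result for a given step does not depend on the sequence length. The difference between the two options is that option 2 only rescales what the RNN emits, while option 1 also changes the recurrent computation itself.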