keras · lstm · recurrent-neural-network

How to stack same RNN for every layer?


I would like to know how to stack multiple RNN layers where every layer is the same RNN, i.e. every layer shares the same weights. I have read about stacking LSTMs and RNNs, but I found that each layer gets its own weights.

1 layer code:

inputs = keras.Input(shape=(maxlen,), batch_size=batch_size)

Emb_layer = layers.Embedding(max_features, word_dim)
Emb_output = Emb_layer(inputs)

first_layer = layers.SimpleRNN(n_hidden, use_bias=True, return_sequences=False, stateful=False)
first_layer_output = first_layer(Emb_output)

dense_layer = layers.Dense(1, activation='sigmoid')
dense_output = dense_layer(first_layer_output)

model = keras.Model(inputs=inputs, outputs=dense_output)
model.summary()

[Screenshot: model summary for the 1-layer RNN]

inputs = keras.Input(shape=(maxlen,), batch_size=batch_size)

Emb_layer = layers.Embedding(max_features, word_dim)
Emb_output = Emb_layer(inputs)

first_layer = layers.SimpleRNN(n_hidden, use_bias=True, return_sequences=True, stateful=True)
first_layer_output = first_layer(Emb_output)
first_layer_state = first_layer.states

second_layer = layers.SimpleRNN(n_hidden, use_bias=True, return_sequences=False, stateful=False)
second_layer_set_state = second_layer(first_layer_output, initial_state=first_layer_state)

dense_layer = layers.Dense(1, activation='sigmoid')
dense_output = dense_layer(second_layer_set_state)

model = keras.Model(inputs=inputs, outputs=dense_output)
model.summary()

[Screenshot: model summary for the stacked 2-layer RNN]

For example, I want to build a two-layer RNN where the first and second layers have the same weights, so that when I update the weights of the first layer, the second layer is updated to the same values. As far as I know, TF exposes RNN.states, which returns the state from the previous layer; however, when I use it, each layer is still treated independently. The 2-layer RNN I want should have the same number of trainable parameters as the 1-layer one, since the layers share weights, but this did not work.


Solution

  • You can view a layer object as a container for its weights that knows how to apply them. You can use the same layer object as many times as you want. Assuming the embedding dimension and the RNN dimension are the same, you can do:

    states = Emb_layer(inputs)
    first_layer = layers.SimpleRNN(n_hidden, use_bias=True, return_sequences=True)
    for _ in range(10):
        states = first_layer(states)
    

    There is no reason to set stateful to True. That option is used when you split long sequences into multiple batches and want the RNN to remember its state between batches, so you do not have to set the initial states manually. You can get the final state of the RNN (the one you want to use for classification) by simply indexing the last position of states.
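    Putting this together, here is a minimal end-to-end sketch (with hypothetical sizes for max_features, word_dim, n_hidden, and maxlen; note word_dim must equal n_hidden so the layer's output can be fed back into it) showing that reusing one SimpleRNN object shares its weights, so the two-pass model has the same number of trainable parameters as a one-pass model:

    ```python
    from tensorflow import keras
    from tensorflow.keras import layers

    # Assumed sizes for illustration; word_dim == n_hidden is required
    # so the shared RNN can consume its own output.
    max_features, word_dim, n_hidden, maxlen = 1000, 32, 32, 20

    inputs = keras.Input(shape=(maxlen,))
    states = layers.Embedding(max_features, word_dim)(inputs)

    # One layer object, applied twice: both passes use the same weights.
    shared_rnn = layers.SimpleRNN(n_hidden, use_bias=True, return_sequences=True)
    for _ in range(2):
        states = shared_rnn(states)

    # Index the last time step to get the final hidden state for classification.
    last_state = states[:, -1, :]
    outputs = layers.Dense(1, activation='sigmoid')(last_state)

    model = keras.Model(inputs=inputs, outputs=outputs)
    model.summary()
    ```

    In the summary, the SimpleRNN contributes its parameters only once (kernel + recurrent kernel + bias), regardless of how many times it is applied, which is exactly the weight sharing asked about.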