I am experimenting with recurrent neural network layers in TensorFlow/Keras and I am having a look at the recurrent_initializer. I wanted to know more about its influence on the layer, so I created a SimpleRNN layer as follows:
import numpy as np
from tensorflow import keras

rnn_layer = keras.layers.SimpleRNN(1, return_sequences=True, kernel_initializer=keras.initializers.ones, recurrent_initializer=keras.initializers.zeros, activation="linear")
Running this code makes the addition in the recurrent net visible:
inp = np.zeros(shape=(1, 1, 20), dtype=np.float32)
for i in range(20):
    inp[0][0][:i] = 5
    # inp[0][0][i:] = 0
    print(f"i:{i} {rnn_layer(inp)}")
output:
i:0 [[[0.]]]
i:1 [[[5.]]]
i:2 [[[10.]]]
i:3 [[[15.]]]
i:4 [[[20.]]]
i:5 [[[25.]]]
i:6 [[[30.]]]
i:7 [[[35.]]]
i:8 [[[40.]]]
i:9 [[[45.]]]
i:10 [[[50.]]]
i:11 [[[55.]]]
i:12 [[[60.]]]
i:13 [[[65.]]]
i:14 [[[70.]]]
i:15 [[[75.]]]
i:16 [[[80.]]]
i:17 [[[85.]]]
i:18 [[[90.]]]
i:19 [[[95.]]]
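For reference, the same numbers come out of a plain NumPy dot product: with a kernel of ones and a single timestep, the output is just the sum of the 20 input features (a minimal check, assuming the default zero bias):

import numpy as np

# With kernel_initializer=ones the kernel is a (20, 1) matrix of ones,
# so a single-timestep output is simply the sum of the 20 input features.
kernel = np.ones((20, 1), dtype=np.float32)

for i in range(20):
    x = np.zeros((1, 20), dtype=np.float32)
    x[0, :i] = 5
    print(f"i:{i} {x @ kernel}")  # [[0.]], [[5.]], [[10.]], ...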
Now I change the recurrent_initializer to something different, like a glorot_normal initializer:
rnn_layer = keras.layers.SimpleRNN(1, return_sequences=True, kernel_initializer = keras.initializers.ones, recurrent_initializer=keras.initializers.glorot_normal(seed=0), activation="linear")
But I still get the same results. I thought it might depend on some logic which an RNN is missing but an LSTM has, so I tried it with an LSTM, but I still get the same results. I guess there is something about the recurrent logic that I am still missing. Can someone explain to me what the recurrent_initializer's purpose is and how it affects the recurrent layer?
Thanks a lot!
Your input to the RNN layer has shape (1, 1, 20), which means one timestep per sample. By default an RNN resets its state between batches (stateful=False), so the hidden state entering that single timestep is all zeros, and the recurrent kernel created by recurrent_initializer never multiplies anything non-zero. That is why changing recurrent_initializer has no visible effect. You have to increase the sequence length of your input:
import numpy as np
import tensorflow as tf

inp = np.ones(shape=(5, 4, 1), dtype=np.float32)  # sequence length == 4

rnn_layer1 = tf.keras.layers.LSTM(1, return_state=True, return_sequences=False,
                                  kernel_initializer=tf.keras.initializers.ones,
                                  recurrent_initializer=tf.keras.initializers.zeros,
                                  activation="linear")
rnn_layer2 = tf.keras.layers.LSTM(1, return_state=True, return_sequences=False,
                                  kernel_initializer=tf.keras.initializers.ones,
                                  recurrent_initializer=tf.keras.initializers.glorot_normal(seed=0),
                                  activation="linear")
first_sample = inp[0:1, :, :]  # shape (1, 4, 1)
print(rnn_layer1(first_sample))
print(rnn_layer2(first_sample))
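With a sequence length of 4, the hidden state is fed back three times, so rnn_layer1 and rnn_layer2 now print different values. To make explicit where recurrent_initializer enters, here is a minimal sketch (using a SimpleRNN for simplicity, since its step is just h_t = activation(x_t·W + h_{t-1}·U + b)) that reproduces the layer output by applying the recurrent kernel by hand:

import numpy as np
import tensorflow as tf

rnn = tf.keras.layers.SimpleRNN(1, return_sequences=True,
                                kernel_initializer=tf.keras.initializers.ones,
                                recurrent_initializer=tf.keras.initializers.glorot_normal(seed=0),
                                activation="linear")

x = np.ones((1, 4, 1), dtype=np.float32)  # batch of 1, sequence length 4, 1 feature
print(rnn(x))                             # layer output for all 4 timesteps

W, U, b = rnn.get_weights()               # kernel, recurrent kernel, bias
h = np.zeros((1, 1), dtype=np.float32)    # initial state is all zeros
for t in range(4):
    # h_t = x_t @ W + h_{t-1} @ U + b  (activation is linear here)
    h = x[:, t, :] @ W + h @ U + b
    print(f"t={t} h={h}")

From t=1 on, h is non-zero, so the value drawn by recurrent_initializer (U) changes the result; with recurrent_initializer=zeros every step would just output the kernel contribution of 1.0.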