tensorflow · keras · recurrent-neural-network · theory

Keras - Trouble understanding RNN units


I started deep learning a few months ago using TensorFlow and tf.keras.

I fully get the concept behind classic Dense layers or convolutional/pooling layers, where the units parameter is the number of neurons or filters.

I have recently moved on to RNNs, but I am confused by this units parameter.

In the following code from a book example, I am feeding in a time series of 50 periods, but I don't get what the 20 in the SimpleRNN layer really represents. In an ANN, the first Dense layer has the same number of parameters as the input, which got me confused.

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(1)
])

And the one with the Dense Layer also:

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
    keras.layers.SimpleRNN(20),
    keras.layers.Dense(1)
])

Thank you for your help!


Solution

  • For the units in keras.layers.SimpleRNN, or any other RNN layer Keras provides, you can think of it as an extension of the basic RNN structure: a single RNN cell contains that many units, each computing on the inputs. Since the activation inside a SimpleRNN cell is tanh, units=1 gives you the graph on the left and units=3 the one on the right:

          output           output1  output2  output3
            ^                   ^      ^      ^
            |                   |      |      |
           tanh                tanh   tanh   tanh
            ^                   ^      ^      ^
            |                   |      |      |
    input ---          input ------------------
    
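    The diagram can be sketched numerically. Below is a minimal plain-NumPy version of one SimpleRNN step (not Keras's actual implementation, just the standard recurrence h_t = tanh(x_t·Wx + h_{t-1}·Wh + b)); it shows that the hidden state, and therefore the per-timestep output, is simply a vector of length units:

    ```python
    import numpy as np

    def simple_rnn_step(x_t, h_prev, Wx, Wh, b):
        """One SimpleRNN step: h_t = tanh(x_t @ Wx + h_prev @ Wh + b)."""
        return np.tanh(x_t @ Wx + h_prev @ Wh + b)

    rng = np.random.default_rng(0)
    units, input_dim = 3, 1                    # units=3, as in the right-hand graph
    Wx = rng.normal(size=(input_dim, units))   # input-to-hidden weights
    Wh = rng.normal(size=(units, units))       # hidden-to-hidden (recurrent) weights
    b = np.zeros(units)                        # one bias per unit

    h = np.zeros(units)                        # initial hidden state
    x_t = np.array([0.5])                      # one scalar timestep
    h = simple_rnn_step(x_t, h, Wx, Wh, b)
    print(h.shape)                             # (3,) -> one tanh output per unit
    ```

    Each of the three tanh nodes in the right-hand graph corresponds to one component of h.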

    Simply think of it like the filters in a CNN; that analogy may help you grasp the concept. Recommended reading: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
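    To connect units to parameter counts: a SimpleRNN layer with n units and input dimension d has d·n input weights, n·n recurrent weights, and n biases. A small sketch (plain Python, no Keras needed) reproduces the counts model.summary() would report for the book model above, assuming a univariate input series (d=1):

    ```python
    def simple_rnn_params(input_dim, units):
        """Trainable parameters of one SimpleRNN layer:
        input weights (input_dim*units) + recurrent weights (units*units) + biases (units)."""
        return input_dim * units + units * units + units

    # The three layers of the book model; each layer's input_dim is the
    # previous layer's number of units:
    layer1 = simple_rnn_params(1, 20)    # 440
    layer2 = simple_rnn_params(20, 20)   # 820
    layer3 = simple_rnn_params(20, 1)    # 22
    print(layer1, layer2, layer3, layer1 + layer2 + layer3)  # 440 820 22 1282
    ```

    Note that the count depends on units but not on the 50 timesteps: the same weights are reused at every step of the sequence.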