python tensorflow neural-network recurrent-neural-network

What is the default initial_state in tf.nn.dynamic_rnn?


Usually, we would use cell.zero_state as the initial_state of tf.nn.dynamic_rnn.

Now I'm wondering what the default initial_state in tf.nn.dynamic_rnn is if we don't set initial_state.

The most similar question I can find is "Setting initial state in dynamic RNN".

But I can't understand what "scratch" means in its answer:

If you don't set the initial_state, it will be trained from scratch as other weight matrices do.


Solution

  • If initial_state is not set, dynamic_rnn first tries to call cell.get_initial_state(inputs=None, batch_size=batch_size, dtype=dtype) to obtain it; if the cell does not define get_initial_state, it falls back to cell.zero_state. See the source code of dynamic_rnn: https://github.com/tensorflow/tensorflow/blob/v2.2.0/tensorflow/python/ops/rnn.py#L671

    For most built-in cell implementations, cell.get_initial_state returns the same result as cell.zero_state when inputs is None. For example: https://github.com/tensorflow/tensorflow/blob/v2.2.0/tensorflow/python/ops/rnn_cell_impl.py#L281-L309

    In conclusion, for the built-in cells the initial state is cell.zero_state (an all-zero state) whether you pass it explicitly or leave initial_state unset. To change that default, you can build your own cell and override cell.get_initial_state.
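
The fallback order described above can be sketched in plain Python. This is a simplified stand-in, not TensorFlow's actual code: resolve_initial_state, ToyCell and ToyCellWithCustomInit are hypothetical names made up for illustration, and the state is a nested list instead of a tensor. Only the method names get_initial_state and zero_state, and the order in which they are tried, come from the answer.

```python
def resolve_initial_state(cell, batch_size, dtype=None, initial_state=None):
    """Hypothetical helper mirroring the fallback order described in the answer."""
    # 1. An explicitly passed initial_state always wins.
    if initial_state is not None:
        return initial_state
    # Without an explicit state, a dtype is needed to build one.
    if dtype is None:
        raise ValueError("dtype is required when initial_state is not given")
    # 2. Prefer cell.get_initial_state if the cell defines it.
    get_initial_state = getattr(cell, "get_initial_state", None)
    if get_initial_state is not None:
        return get_initial_state(inputs=None, batch_size=batch_size, dtype=dtype)
    # 3. Otherwise fall back to cell.zero_state.
    return cell.zero_state(batch_size, dtype)


class ToyCell:
    """Hypothetical cell that only defines zero_state."""

    def __init__(self, num_units):
        self.num_units = num_units

    def zero_state(self, batch_size, dtype):
        # One row of zeros per batch element.
        return [[dtype(0)] * self.num_units for _ in range(batch_size)]


class ToyCellWithCustomInit(ToyCell):
    """Hypothetical cell that overrides get_initial_state to start from ones."""

    def get_initial_state(self, inputs=None, batch_size=None, dtype=None):
        return [[dtype(1)] * self.num_units for _ in range(batch_size)]


default_state = resolve_initial_state(ToyCell(3), batch_size=2, dtype=float)
custom_state = resolve_initial_state(ToyCellWithCustomInit(3), batch_size=2, dtype=float)
print(default_state)  # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(custom_state)   # [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```

In this sketch the first cell ends up with zeros because nothing overrides the default, while the second one shows how a custom get_initial_state changes the starting state without passing initial_state at the call site.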