Tags: tensorflow, keras, neural-network, recurrent-neural-network

How do I get the activations of all units in all layers of a network at all timesteps?


I would like to inspect the activities of all the units in all layers of a recurrent neural network over many timesteps.

In the code below, I create a Keras model containing a SimpleRNN layer and a Dense layer.

If I use the parameter return_sequences=True when initializing the RNN, I can get the RNN's activities by calling rnn(inputs), for any appropriate inputs array. I can also get the activities of the output unit over time by calling model(inputs).

But if I want both, calling both rnn(inputs) and model(inputs) performs the computation twice. Is there a way to avoid doing the computation twice while still having access to the activities of all units over time? Thank you!

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

SEED = 42
tf.random.set_seed(SEED)
np.random.seed(SEED)

timesteps = 3
embedding_dim = 4
units = 2
num_samples = 5

input_shape = (num_samples, timesteps, embedding_dim)
model = Sequential([
    SimpleRNN(units, stateful=True, batch_input_shape=input_shape, return_sequences=True, activation="linear", 
              recurrent_initializer="identity", bias_initializer="ones"), 
    Dense(1)])

some_initial_state = np.ones((num_samples, units))
some_initial_state[0,0] = 0.123
rnn = model.layers[0]
rnn.reset_states(states=some_initial_state)


# Calling rnn(...) and then model(...) runs the RNN forward pass twice
some_initial_state, rnn(np.zeros((num_samples, timesteps, embedding_dim))), model(np.zeros((num_samples, timesteps, embedding_dim)))

With the following output:

(array([[0.123, 1.   ],
        [1.   , 1.   ],
        [1.   , 1.   ],
        [1.   , 1.   ],
        [1.   , 1.   ]]),
 <tf.Tensor: shape=(5, 3, 2), dtype=float32, numpy=
 array([[[1.123    , 2.       ],
         [2.1230001, 3.       ],
         [3.1230001, 4.       ]],

        [[2.       , 2.       ],
         [3.       , 3.       ],
         [4.       , 4.       ]],

        [[2.       , 2.       ],
         [3.       , 3.       ],
         [4.       , 4.       ]],

        [[2.       , 2.       ],
         [3.       , 3.       ],
         [4.       , 4.       ]],

        [[2.       , 2.       ],
         [3.       , 3.       ],
         [4.       , 4.       ]]], dtype=float32)>,
 <tf.Tensor: shape=(5, 3, 1), dtype=float32, numpy=
 array([[[1.971611 ],
         [2.4591472],
         [2.9466834]],

        [[2.437681 ],
         [2.9252172],
         [3.4127533]],

        [[2.437681 ],
         [2.9252172],
         [3.4127533]],

        [[2.437681 ],
         [2.9252172],
         [3.4127533]],

        [[2.437681 ],
         [2.9252172],
         [3.4127533]]], dtype=float32)>)
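
These values follow directly from the setup: with zero inputs, a linear activation, an identity recurrent kernel, and a bias of ones, each step computes h_t = h_{t-1} + 1, so the first sample's first unit (initialized to 0.123) goes 1.123, 2.123, 3.123, while every unit initialized to 1 goes 2, 3, 4. The Dense outputs are then linear combinations of these activities under randomly initialized weights.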

Solution

  • You will need a model with multiple outputs, built with the Functional API, which would look like this:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import Input, Model
    from tensorflow.keras.layers import SimpleRNN, Dense

    SEED = 42
    tf.random.set_seed(SEED)
    np.random.seed(SEED)
    
    timesteps = 3
    embedding_dim = 4
    units = 2
    num_samples = 5
    
    inputs = Input(batch_shape=(num_samples, timesteps, embedding_dim))
    # initial state as Keras Input
    initial_state = Input((units,))
    rnn = SimpleRNN(units, stateful=True, return_sequences=True, activation="linear", 
                    recurrent_initializer="identity", bias_initializer="ones")
    hidden = rnn(inputs, initial_state=initial_state)
    dense = Dense(1)(hidden)
    
    # The initial state is an extra input, and the model has two outputs
    model = Model([inputs, initial_state], outputs=[hidden, dense])
    
    some_input = np.zeros((num_samples, timesteps, embedding_dim))
    some_initial_state = np.ones((num_samples, units))
    some_initial_state[0,0] = 0.123
    rnn_output, dense_output = model([some_input, some_initial_state])
    
    some_initial_state, rnn_output, dense_output
    

    Note that you don't need a stateful RNN to set the initial state when using the Functional API (see the sketch below). Also, by running the forward pass twice as in your example, the second call would start from a different RNN state, since a stateful layer updates its state on every call (which I believe is not the desired result).
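
    For completeness, here is a minimal sketch of the non-stateful variant, with the same layer settings as above; the only changes are dropping stateful=True and the fixed batch size, and supplying the initial state purely through the extra model input:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import Input, Model
    from tensorflow.keras.layers import SimpleRNN, Dense

    timesteps = 3
    embedding_dim = 4
    units = 2
    num_samples = 5

    # Without statefulness, the batch size can stay unspecified
    inputs = Input(shape=(timesteps, embedding_dim))
    initial_state = Input(shape=(units,))
    rnn = SimpleRNN(units, return_sequences=True, activation="linear",
                    recurrent_initializer="identity", bias_initializer="ones")
    hidden = rnn(inputs, initial_state=initial_state)
    dense = Dense(1)(hidden)

    # Two outputs: per-timestep RNN activities and the Dense predictions
    model = Model([inputs, initial_state], outputs=[hidden, dense])

    some_input = np.zeros((num_samples, timesteps, embedding_dim))
    some_initial_state = np.ones((num_samples, units))
    some_initial_state[0, 0] = 0.123
    rnn_output, dense_output = model([some_input, some_initial_state])

    Since a single forward pass now produces both tensors, the RNN computation is done only once, and repeated calls with the same initial_state are reproducible because no internal state is carried over between calls.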