My neural network has a custom layer which takes an input vector x, generates a normally distributed tensor A, and returns both A (used in subsequent layers) and the product Ax. Assuming I want to reuse the value stored in A, as output by the custom layer, in a second, different layer, is there any subtle aspect I need to factor in when deciding which Keras backend function (K.random_normal or K.random_normal_variable) to use to generate A?
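For concreteness, here is a simplified sketch of the kind of layer I have in mind (the class name, shapes, and use of K.dot are illustrative, not my actual code), assuming Keras 2.1 with the TensorFlow backend:

from keras import backend as K
from keras.engine.topology import Layer

class RandomMatrixLayer(Layer):
    """Simplified sketch: draw A ~ N(0, 0.5**2) and return both A and Ax."""

    def __init__(self, units, **kwargs):
        self.units = units
        super(RandomMatrixLayer, self).__init__(**kwargs)

    def call(self, x):
        # Both outputs are built from the same symbolic tensor A, so within a
        # single forward pass they consume one and the same draw of A.
        A = K.random_normal(shape=(self.units, self.units), mean=0.0, stddev=0.5)
        Ax = K.dot(x, K.transpose(A))  # assumes x has shape (batch, units)
        return [A, Ax]

    def compute_output_shape(self, input_shape):
        # A carries no batch dimension; Ax matches the batch size of x.
        return [(self.units, self.units), (input_shape[0], self.units)]

With that context, two things stand out to me: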
a) The backend function random_normal returns a tensor storing a different value following each call (see the code snippet below). To me, this suggests that random_normal acts as a generator of normally distributed values. Does this mean that one should not use random_normal to generate a normally distributed tensor whose value is meant to hold across calls?
b) The backend function random_normal_variable appears safer (see the code snippet below), as it retains its value across calls.
Is my conceptual understanding correct, or am I missing something basic? I am using Keras 2.1.2 and TensorFlow 1.4.0.
Experiment with random_normal (value changes across calls):
In [5]: A = K.random_normal(shape = (2,2), mean=0.0, stddev=0.5)
In [6]: K.get_value(A)
Out[6]: array([[ 0.4459489 , -0.82019573],
[-0.39853573, -0.33919844]], dtype=float32)
In [7]: K.get_value(A)
Out[7]: array([[-0.37467018, 0.42445764],
[-0.573843 , -0.3468301 ]], dtype=float32)
Experiment with random_normal_variable (value holds across calls):
In [9]: B = K.random_normal_variable(shape=(2,2), mean=0., scale=0.5)
In [10]: K.get_value(B)
Out[10]: array([[ 0.07700552, 0.28008622],
[-0.69484973, -1.32078779]], dtype=float32)
In [11]: K.get_value(B)
Out[11]: array([[ 0.07700552, 0.28008622],
[-0.69484973, -1.32078779]], dtype=float32)
From my understanding, this is because random_normal_variable returns an instantiated Variable, while random_normal returns a Tensor.
K.random_normal(shape=(2,2), mean=0.0, stddev=0.5)
<tf.Tensor 'random_normal:0' shape=(2, 2) dtype=float32>
K.random_normal_variable(shape=(2,2), mean=0.0, scale=0.5)
<tf.Variable 'Variable:0' shape=(2, 2) dtype=float32_ref>
As for why the values vary for the Tensor and not for the Variable, I think the answer to this thread sums it up well:

Variable is basically a wrapper on Tensor that maintains state across multiple calls to run [...]
The answer also mentions that a variable needs to be initialized before it can be evaluated. You did not initialize the variable yourself here, yet evaluating it works: the returned variable is already initialized, thanks to a call to tensorflow.random_normal_initializer within the random_normal_variable function. Hope this clarifies why your code behaves this way.
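For illustration, here is a rough approximation of my own (assuming the TensorFlow backend; the exact internals of random_normal_variable differ slightly) that mirrors your B example:

import tensorflow as tf
from keras import backend as K

# Rough equivalent of K.random_normal_variable(shape=(2, 2), mean=0., scale=0.5):
# draw an initial value through a random-normal initializer, then wrap that
# value in a Variable so the sampled numbers persist across evaluations.
initializer = tf.random_normal_initializer(mean=0.0, stddev=0.5)
initial_value = initializer(shape=(2, 2), dtype=tf.float32)  # a Tensor holding the draw
B = K.variable(initial_value)  # a tf.Variable; Keras initializes it when first evaluated

print(K.get_value(B))  # first evaluation runs the initialization
print(K.get_value(B))  # same array again: the Variable keeps its state across calls

Once wrapped in a Variable, re-evaluating B only reads the stored state; the underlying random op is not executed again.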