In the reference paper for TensorFlow Distributions (now TensorFlow Probability), it is mentioned that TensorFlow Variables can be used to construct Bijector and TransformedDistribution objects, i.e.:
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tf.enable_eager_execution()
shift = tf.Variable(1., dtype=tf.float32)
myBij = tfp.bijectors.Affine(shift=shift)
# Normal distribution centered at zero, then shifted to 1 using the bijection
myDistr = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=myBij,
    name="test")
# 2 samples of a normal centered at 1:
y = myDistr.sample(2)
# 2 samples of a normal centered at 0, obtained using inverse transform of myBij:
x = myBij.inverse(y)
I would now like to modify the shift variable (say, I might compute gradients of some likelihood function with respect to the shift and update its value), so I do:
shift.assign(2.)
gx = myBij.forward(x)
I would expect that gx = y + 1, but I see that gx = y... And indeed, myBij.shift still evaluates to 1.
If I try to modify the bijector directly, i.e.:
myBij.shift.assign(2.)
I get
AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'
Computing gradients also does not work as expected:
with tf.GradientTape() as tape:
    gx = myBij.forward(x)
grad = tape.gradient(gx, shift)
This yields None, as well as this exception when the script ends:
Exception ignored in: <bound method GradientTape.__del__ of <tensorflow.python.eager.backprop.GradientTape object at 0x7f529c4702e8>>
Traceback (most recent call last):
File "~/.local/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 765, in __del__
AttributeError: 'NoneType' object has no attribute 'context'
What am I missing here?
Edit: I got it working with a graph/session, so it seems there is an issue with eager execution...
Note: I have tensorflow version 1.12.0 and tensorflow_probability version 0.5.0
If you are using eager mode, you will need to recompute everything from the variable forward. It's best to capture this logic in a function:
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tf.enable_eager_execution()
shift = tf.Variable(1., dtype=tf.float32)
def f():
    myBij = tfp.bijectors.Affine(shift=shift)
    # Normal distribution centered at zero, then shifted to 1 using the bijection
    myDistr = tfd.TransformedDistribution(
        distribution=tfd.Normal(loc=0., scale=1.),
        bijector=myBij,
        name="test")
    # 2 samples of a normal centered at 1:
    y = myDistr.sample(2)
    # 2 samples of a normal centered at 0, obtained using the inverse
    # transform of myBij:
    x = myBij.inverse(y)
    return x, y
x, y = f()
shift.assign(2.)
gx, _ = f()
Regarding gradients, you will need to wrap the calls to f() in a GradientTape.