
Use and modify variables in tensorflow bijectors


In the reference paper for TensorFlow Distributions (now Probability), it is mentioned that TensorFlow Variables can be used to construct Bijector and TransformedDistribution objects, i.e.:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

tf.enable_eager_execution()

shift = tf.Variable(1., dtype=tf.float32)
myBij = tfp.bijectors.Affine(shift=shift)

# Normal distribution centered in zero, then shifted to 1 using the bijection
myDistr = tfd.TransformedDistribution(
            distribution=tfd.Normal(loc=0., scale=1.),
            bijector=myBij,
            name="test")

# 2 samples of a normal centered at 1:
y = myDistr.sample(2)
# 2 samples of a normal centered at 0, obtained using inverse transform of myBij:
x = myBij.inverse(y)

I would now like to modify the shift variable (say, I might compute the gradient of some likelihood function with respect to the shift and update its value), so I do:

shift.assign(2.)
gx = myBij.forward(x)

I would expect that gx = y + 1, but I see that gx = y... And indeed, myBij.shift still evaluates to 1.

If I try to modify the bijector directly, i.e.:

myBij.shift.assign(2.)

I get

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

Computing gradients also does not work as expected:

with tf.GradientTape() as tape:
    gx = myBij.forward(x)
grad = tape.gradient(gx, shift)

This yields None, as well as the following exception when the script ends:

Exception ignored in: <bound method GradientTape.__del__ of <tensorflow.python.eager.backprop.GradientTape object at 0x7f529c4702e8>>
Traceback (most recent call last):
File "~/.local/lib/python3.6/site-packages/tensorflow/python/eager/backprop.py", line 765, in __del__
AttributeError: 'NoneType' object has no attribute 'context'

What am I missing here?

Edit: I got it working with a graph/session, so it seems there is an issue with eager execution...

Note: I have tensorflow version 1.12.0 and tensorflow_probability version 0.5.0


Solution

  • If you are using eager mode, the bijector captures the value of the variable when it is constructed (which is why myBij.shift is an EagerTensor), so you will need to recompute everything from the variable forward. It is best to capture this logic in a function:

    import tensorflow as tf
    import tensorflow_probability as tfp
    tfd = tfp.distributions
    
    tf.enable_eager_execution()
    
    shift = tf.Variable(1., dtype=tf.float32)
    def f():
      myBij = tfp.bijectors.Affine(shift=shift)
    
      # Normal distribution centered in zero, then shifted to 1 using the bijection
      myDistr = tfd.TransformedDistribution(
                distribution=tfd.Normal(loc=0., scale=1.),
                bijector=myBij,
                name="test")
    
      # 2 samples of a normal centered at 1:
      y = myDistr.sample(2)
      # 2 samples of a normal centered at 0, obtained using inverse
      # transform of myBij:
      x = myBij.inverse(y)
      return x, y
    x, y = f()
    shift.assign(2.)
    gx, _ = f()
    

    Regarding gradients, you will need to wrap calls to f() in a GradientTape.