python numpy tensorflow tensorflow-probability

Using tf.where (or np.where) to draw randomly conditional on an input

I have a TensorFlow vector that only contains 1s and 0s, like a = [0, 0, 0, 1, 0, 1], and conditional on the value of a, I want to draw new random values 0 or 1. If the value of a is 1, I want to draw a new value but if the value of a is 0 I want to leave it alone. So I've tried this:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

# random draw of zeros and ones
a = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)

which gives me <tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0., 0., 1., 0., 1.], dtype=float32)> then if I redraw

# redraw with a different probability if value is 1. in the original draw
b = tf.where(a == 1.0, tfd.Binomial(total_count = 1., probs = 0.5).sample(1), a)

I would expect tf.where to give me a new vector b in which, on average, half of the 1s have become 0s, but instead it returns either a copy of a or a vector of all 0s. The output I am hoping for would be one of b = [0, 0, 0, 0, 0, 0], b = [0, 0, 0, 0, 0, 1], b = [0, 0, 0, 1, 0, 0], or b = [0, 0, 0, 1, 0, 1]. I could of course just use b = tfd.Binomial(total_count = 1.0, probs = 0.25).sample(6), but in my particular case the order of the original vector matters.

A more general situation might use a different distribution, so that bitwise operations can't easily be used. For example:

# random draw of normals
a = tfd.Normal(loc = 0., scale = 1.).sample(6)
# redraw with a different scale if value is greater than zero in the original draw
b = tf.where(a > 0, tfd.Normal(loc = 0., scale = 2.).sample(1), a)

Solution

APPROACH 1:

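Not tested, but I think the middle parameter should be a tensor with the same shape as the original one, e.g. 6 elements. With .sample(1) there is only a single draw, and tf.where broadcasts that one value to every position where the condition holds, which is why the 1s either all stay 1 or all become 0.

First, make a second random sequence of the same length: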

a2 = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)

NOTE: If you need a different probability, you simply use that probability when creating a2.

prob = 0.3
a2 = tfd.Binomial(total_count = 1.0, probs = prob).sample(6)

Then:

b = tf.where(a == 1.0, a2, a)

Explanation:

Where a is 0, the value in a2 is ignored and b stays 0. Where a is 1, b takes the corresponding value from a2, which is 1 with probability prob.
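To make the broadcasting point concrete, here is a minimal sketch using np.where, which follows the same broadcasting rules as tf.where. The seed and the names single, per_element, b_shared and b_independent are my own choices for illustration, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
a = np.array([0., 0., 0., 1., 0., 1.])

# A single draw of shape (1,): np.where broadcasts it, so every 1 in `a`
# is replaced by the same value (all stay 1 or all become 0).
single = rng.binomial(n=1, p=0.5, size=1).astype(a.dtype)
b_shared = np.where(a == 1.0, single, a)

# One draw per element: each 1 in `a` gets its own independent replacement.
per_element = rng.binomial(n=1, p=0.5, size=a.shape).astype(a.dtype)
b_independent = np.where(a == 1.0, per_element, a)
```

In b_shared the two positions that were 1 always agree with each other, which matches the behaviour described in the question. In b_independent they are drawn separately.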

APPROACH 2:

If that doesn't work, pass the condition explicitly as a boolean tensor of [True, False, ...]. A comparison on a tensor is already element-wise, so no Python-level map is needed:

# Boolean tensor with the same shape as a: True where a is positive.
cond = tf.greater(a, 0.0)

# Filling with 0.0 is the same as filling with a here, because a is 0 wherever cond is False.
b = tf.where(cond, a2, 0.0)
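As a quick sanity check of Approach 2, here is a small sketch with np.where standing in for tf.where. The input values are fixed by hand for illustration, and b_zero_fill and b_keep_a are names I made up.

```python
import numpy as np

# Hand-picked 0/1 values, chosen so that every combination appears once.
a = np.array([0., 0., 1., 1.])
a2 = np.array([0., 1., 0., 1.])

# The comparison is element-wise and already yields a boolean mask.
cond = a > 0

# Fill the "False" positions with a constant 0.0 ...
b_zero_fill = np.where(cond, a2, 0.0)
# ... or with the original values; identical for 0/1 data.
b_keep_a = np.where(cond, a2, a)
```

The two fills only coincide because a is 0 wherever the condition is False. For a distribution such as the Normal one in the question, the third argument should be a rather than 0.0.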

APPROACH 3:

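Doesn't use tf.where. Tested on plain Python lists of ints (see the example data below), not on tensors.

First, make a second random sequence of the same length:

a2 = tfd.Binomial(total_count = 1.0, probs = prob).sample(6)

Then combine the two by "bitwise-and"ing corresponding elements. The Binomial samples are float32 and & is not defined for floats, so cast to an integer dtype first:

a_int = tf.cast(a, tf.int32)
a2_int = tf.cast(a2, tf.int32)
b = tf.bitwise.bitwise_and(a_int, a2_int)   # element-wise over the whole tensor

NOTE: any other element-wise function could be used to combine the two instead. For example, a * a2 gives the same result for 0/1 values and keeps the float dtype.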

Example data:

a = [0,0,1,1]
a2 = [0,1,0,1]

Result:

b = [0,0,0,1]

Explanation:

The result is 1 only where both a and a2 are 1. So every 0 in a stays 0, and every 1 in a stays 1 with probability prob.