Using tf.where (or np.where) to draw randomly conditional on an input


I have a TensorFlow vector that only contains 1s and 0s, like a = [0, 0, 0, 1, 0, 1], and conditional on the value of a, I want to draw new random values 0 or 1. If the value of a is 1, I want to draw a new value but if the value of a is 0 I want to leave it alone. So I've tried this:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

# random draw of zeros and ones
a = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)

which gives me <tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0., 0., 1., 0., 1.], dtype=float32)> then if I redraw

# redraw with a different probability if value is 1. in the original draw
b = tf.where(a == 1.0, tfd.Binomial(total_count = 1., probs = 0.5).sample(1), a)

I would expect tf.where to give me a new vector b in which, on average, half of the 1s have become 0s, but instead it returns either a copy of a or a vector of all 0s. The output I want is any one of b = [0, 0, 0, 0, 0, 0], b = [0, 0, 0, 0, 0, 1], b = [0, 0, 0, 1, 0, 0], or b = [0, 0, 0, 1, 0, 1]. I could of course just use b = tfd.Binomial(total_count = 1.0, probs = 0.25).sample(6), but in my particular case the order of the original vector matters.

A more general situation might use a different distribution so that bit-wise operations can't be easily used. For example

# random draw of normals
a = tfd.Normal(loc = 0., scale = 1.).sample(6)
# redraw with a different scale if value is greater than zero in the original draw
b = tf.where(a > 0, tfd.Normal(loc = 0., scale = 2.).sample(1), a)

Solution

  • APPROACH 1:

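    Not tested, but I think the middle argument should be a tensor with the same shape as the original one. With .sample(1) a single draw is broadcast to every position, so all the 1s change together or not at all. E.g. for 6 elements: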

    First, make a second random sequence, of same length:

    a2 = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)
    

    NOTE: If you need a different probability, you simply use that probability when creating a2.

    prob = 0.3
    a2 = tfd.Binomial(total_count = 1.0, probs = prob).sample(6)
    

    Then:

    b = tf.where(a == 1.0, a2, a)
    

    Explanation:

    The values in a2 are irrelevant where a is 0. Where a is 1, b takes the value of a2, which is 1 with probability prob.
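
Since the title also mentions np.where, here is a minimal NumPy sketch of Approach 1. It contrasts a single broadcast draw (the behaviour seen in the question) with one draw per element. The seed and the names single and b_broadcast are assumptions made for illustration; the same pattern should carry over to tf.where with .sample(6), though that is not verified here.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

a = np.array([0., 0., 0., 1., 0., 1.])
prob = 0.5

# A draw of shape (1,) is broadcast to every position by np.where,
# so all the 1s in `a` are kept together or zeroed together.
single = rng.binomial(n=1, p=prob, size=1).astype(a.dtype)
b_broadcast = np.where(a == 1.0, single, a)

# One draw per element: each 1 in `a` gets its own independent redraw.
a2 = rng.binomial(n=1, p=prob, size=a.shape).astype(a.dtype)
b = np.where(a == 1.0, a2, a)

print(b_broadcast, b)
```

The positions where a is 0 stay 0 in both results; only the second version can change the 1s independently of each other.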


    APPROACH 2:

    If that doesn't work, then the first argument may need to be an explicit boolean tensor of [True, False, ...]:

    # An elementwise comparison already yields a boolean tensor;
    # no Python-level map over the elements is needed.
    cond = a > 0

    # Take the redraw where cond is True; keep the original value elsewhere.
    b = tf.where(cond, a2, a)
    

    APPROACH 3:

    Tested on plain integer lists like the example data below (not on TensorFlow tensors). Doesn't use tf.where.

    First, make a second random sequence, of same length:

    a2 = tfd.Binomial(total_count = 1.0, probs = prob).sample(6)
    

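    Then combine the two, "bitwise-and"ing corresponding elements. The & operator is not defined for floats, so float-valued 0/1 draws are cast to int first (for 0/1 values, the elementwise product a * a2 gives the same result):

    def and2(x, y):
        # cast to int because & is not defined for floats
        return int(x) & int(y)

    b = list(map(and2, a, a2))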
    

    NOTE: could alternatively use any other function to combine the two corresponding elements.

    Example data:

    a = [0,0,1,1]
    a2 = [0,1,0,1]
    

    Result:

    b = [0,0,0,1]
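
The example data can be checked with a short NumPy sketch. It uses the vectorized & on integer arrays and, as an alternative for float-valued 0/1 draws, an elementwise product; the names b_and and b_mul are illustrative only.

```python
import numpy as np

a = np.array([0, 0, 1, 1])
a2 = np.array([0, 1, 0, 1])

# Bitwise AND is defined for integer arrays.
b_and = a & a2

# For float 0/1 values, the elementwise product gives the same pattern.
b_mul = a.astype(float) * a2.astype(float)

print(b_and)  # [0 0 0 1]
```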
    

    Explanation:

    Where a is 0, the result is 0 whatever a2 is. Where a is 1, the result equals a2, which is 1 with probability prob.
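
For the more general case in the question, where bitwise tricks don't apply, the same "draw a full-length second sample, then select" idea can be sketched with np.where. This is a NumPy illustration under assumed names (redraw, mask) and an arbitrary seed, not a tested TensorFlow Probability solution.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility

# Original draw from a standard normal.
a = rng.normal(loc=0.0, scale=1.0, size=6)

# Full-length redraw with a different scale, one value per element.
redraw = rng.normal(loc=0.0, scale=2.0, size=a.shape)

# Replace only the entries that were positive in the original draw.
mask = a > 0
b = np.where(mask, redraw, a)

print(a)
print(b)
```

Entries of a that were not positive pass through unchanged, so the order of the original vector is preserved.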