Search code examples
tensorflowsamplingmemory-efficientmultinomial

Tensorflow: generate samples from a Multinomial distribution [Space efficient way?]


I have a quick question. How can I sample values in {0, 1} from a Multinomial distribution in TensorFlow? Actually I want a function that does what numpy.multinomial does.

Let's assume for example that I have a vector of counts and a vector of probabilities like this:

counts = [5, 4, 3] # D in my code
probs = [0.1, 0.2, 0.3, 0.1, 0.2, 0.1] # v in my code

then I would like to return a matrix of size (len(counts), len(probs)) = (3, 6) whose sum over each rows = counts.

I looked at the TensorFlow code and I found a way to do what I want to do. Here is my piece of code:

import tensorflow.contrib.distributions as ds

def multinomial_sampling(D, v):
    dist = ds.Multinomial(total_count=D, probs=v)
    return  tf.reshape(tf.reduce_sum(dist._sample_n(1), 0 , False), [-1, v.shape[1]])

Note: I could just to an tf.expand_dims instead of a tf.reshape

The problem is that doing this is not space efficient and when my matrix is big enough TensorFlow just yells at me that I don't have enough memory because he is trying to create a matrix of size [1, 185929, 3390] where 3390 is the length of my probability vector.

So I wanted to do my own implementation of the multinomial sampling but I don't know how to do that and I think there my idea is not efficient enough (in term of time complexity). Here is my skeleton:

probsn = np.random.uniform(size=20)
probsn /= sum(probsn)

counts = tf.Variable([20, 12, 56, 3])
probs = tf.Variable(tf.convert_to_tensor(probsn))

cprobs = tf.cumsum(probs)

out = tf.zeros([tf.shape(counts)[0], tf.shape(probs)[0]])
for i in counts.shape[0]:
    count = tf.gather(counts, i) # get each count
    sample = tf.gather(out, i) # get each row of out

   for j in range(count): # problem here count is a Tensor and not a int
       rdn_number = tf.random_uniform(1)
       for k, prob in enumerate(range(cprobs)): # problem doesn't work in TF
           if  tf.less(rdn_number, prob): 
               tf.scatter_add(out, [i, k], 1)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    r = sess.run(out)
    print(r)

this is a very naive algorithm. I think there might be a better way to decrease the Time Complexity (with a kind of dictionay of range ? that maps to a range of float values the specific indice in the row ? Not sure if such a thing is possible but it would actually avoid me from iterating to find the indice in my row...).

Also, this implementation doesn't work as mentioned in the code because the number I'm iterating over are actually tensors.

Does somebody has a neat implementation of multinomial sampling in TensorFlow?


Solution

  • Ok so apparently my problem is not a problem because I shouldn't have such a big number (185929). So I had edited some other piece of code. For the sake of completness if you want to sample a very large number and if you want to use sample(), you just cannot do that with:

    import tensorflow.contrib.distributions as ds
    
    def multinomial_sampling(D, v):
        dist = ds.Multinomial(total_count=D, probs=v)
        return  tf.reshape(dist.sample(), [-1, v.shape[1]])
    

    if you don't have enough memory on your computer.

    Note: I reshaped my tensor to the same shape so that TensorFlow won't yell at me when I use the output of the multinomial_sampling function in a while loop. Without tf.reshape, in the tf.while_loop, Tensorflow crashes by saying that the I need to provide shape_invariants.

    So you need to actually procede by batches. The idea is to sample a certain batch (like 1000) in a while loop and to decrease the counts at each iteration. Here is the piece of code I've made:

    probsn = np.random.uniform(size=30) 
    probsn /= sum(probsn) # vector of probability of size 30 (sum of the vector = 1)
    
    u = np.random.randint(2000, 3500, size=100) # define number of counts (vector of size 100 with int in 2000, 3500)
    print(u) # should be the same as the output of print(np.sum(res, 1)) of the tf.Session()
    
    counts = tf.Variable(u, dtype=tf.float32)
    probs = tf.Variable(tf.convert_to_tensor(probsn.astype(np.float32)))
    
    import tensorflow.contrib.distributions as ds
    
    dist = ds.Multinomial(total_count=counts, probs=probs)
    
    out = dist.sample()
    samples = tf.zeros((tf.shape(counts)[0], tf.shape(probs)[0]))
    
    def batch_multinomial(counts, probs, samples):
        batch_count = tf.minimum(1000., counts) # use a batch of 1000
        dist = ds.Multinomial(total_count=batch_count, probs=probs)
        samples += dist.sample()
    
        return counts - batch_count, probs, samples
    
    _, _ , samples = tf.while_loop(lambda counts, *args: tf.equal(tf.reduce_all(tf.less(counts, 0.1)), False) , batch_multinomial, [counts, probs, samples])
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        res = sess.run(samples)
        print(res)
        print(np.sum(res, 1))