python tensorflow integer placeholder batchsize

How to feed an integer through a placeholder in TensorFlow?

I want to feed an integer batch_size through a placeholder in TensorFlow, but it does not behave like an integer. Consider the following example:

import tensorflow as tf


max_length = 5
batch_size = 3

batch_size_placeholder = tf.placeholder(dtype=tf.int32)

mask_0 = tf.one_hot(indices=[0]*batch_size_placeholder, depth=max_length, on_value=0., off_value=1.)
mask_1 = tf.one_hot(indices=[0]*batch_size, depth=max_length, on_value=0., off_value=1.)

# new session
with tf.Session() as sess:
    feed = {batch_size_placeholder : 3}

    batch, mask0, mask1 = sess.run([
                     batch_size_placeholder, mask_0, mask_1
    ], feed_dict=feed)

When I print the values of batch, mask0 and mask1, I get the following:

print(batch)
>>> array(3, dtype=int32)

print(mask0)
>>> array([[0., 1., 1., 1., 1.]], dtype=float32)

print(mask1)
>>> array([[0., 1., 1., 1., 1.],
           [0., 1., 1., 1., 1.],
           [0., 1., 1., 1., 1.]], dtype=float32)

I expected mask0 and mask1 to be the same, but it seems that TensorFlow does not treat batch_size_placeholder as an integer. I assume it is a tensor, but is there any way to use it as an integer in my computations?

Is there any way to fix this? Note that tf.one_hot is only an example. In my real code I alternate training and validation steps during training, and many other computations need a batch_size that differs between the two.

Any help would be appreciated.

Solution

In plain Python, [0]*3 evaluates to [0, 0, 0]. However, batch_size_placeholder is a tensor, not a Python integer, so [0]*batch_size_placeholder is not list repetition. TensorFlow's overloaded * operator converts [0] to a one-element tensor and multiplies it element-wise (with broadcasting) by the scalar placeholder. The result is a 1-D tensor of length 1 that contains 0, whatever value you feed. To use batch_size_placeholder correctly, create a tensor whose length is the value fed to batch_size_placeholder:

mask_0 = tf.one_hot(tf.zeros(batch_size_placeholder, dtype=tf.int32), depth=max_length, on_value=0., off_value=1.)

With 3 fed to batch_size_placeholder, this gives the same result as mask_1.
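To see why Python does not repeat the list here, a toy sketch in pure Python may help. ToyTensor below is a hypothetical stand-in written only for this illustration, not TensorFlow code. It mimics just one thing a tensor does, namely defining __rmul__, which Python calls for list * object before it attempts list repetition.

```python
class ToyTensor:
    """Hypothetical stand-in for a symbolic tensor; not TensorFlow code."""

    def __init__(self, name):
        self.name = name

    def __rmul__(self, other):
        # Python calls this for `other * self` when `other` is a list,
        # before it tries list repetition.
        return "multiply({!r}, {})".format(other, self.name)


repeated = [0] * 3                        # right operand is an int: list repetition
symbolic = [0] * ToyTensor("batch_size")  # right operand defines __rmul__: operator overload

print(repeated)
print(symbolic)
```

The first expression builds a three-element list, while the second never repeats anything: the list is handed to the object as an ordinary operand of a multiplication.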

A simple example that shows the difference between the two expressions:

batch_size_placeholder = tf.placeholder(dtype=tf.int32)

a = [0]*batch_size_placeholder
b = tf.zeros(batch_size_placeholder, dtype=tf.int32)
with tf.Session() as sess:
    print(sess.run([a, b], feed_dict={batch_size_placeholder : 3}))

# [array([0], dtype=int32), array([0, 0, 0], dtype=int32)]
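The same idea should carry over to the other computations mentioned in the question: wherever an op accepts a shape or a count, the scalar placeholder can be passed in place of a Python integer. The following is a minimal sketch, assuming TensorFlow 1.14+ or 2.x with the tensorflow.compat.v1 module available. The names row_ids, ones, row, mask and shapes are made up for this illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: graph mode through the v1 compatibility API

max_length = 5
# Declaring shape=[] makes it explicit that a scalar is expected.
batch_size_placeholder = tf.placeholder(dtype=tf.int32, shape=[])

# One index per example: [0, 1, ..., batch_size - 1].
row_ids = tf.range(batch_size_placeholder)

# A [batch_size, max_length] tensor; the shape list mixes a tensor and a Python int.
ones = tf.ones([batch_size_placeholder, max_length], dtype=tf.float32)

# Repeat a single mask row batch_size times along the first axis.
row = tf.constant([[0., 1., 1., 1., 1.]])
mask = tf.tile(row, [batch_size_placeholder, 1])

shapes = {}
with tf.Session() as sess:
    # Feed a different batch size on each run, e.g. training vs. validation.
    for n in (3, 2):
        ids, o, m = sess.run([row_ids, ones, mask],
                             feed_dict={batch_size_placeholder: n})
        shapes[n] = (ids.shape, o.shape, m.shape)
        print(n, shapes[n])
```

Because the batch size is only resolved when the graph runs, the same graph serves both batch sizes without being rebuilt.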