I have a convolutional layer that produces 16 output feature maps, and I want to transform these maps into 4 maps like this:
Alternatively, I can first multiply all 16 maps by a mask and then split the result into 4 groups, applying reduce_sum to each group. The resulting 4 maps will be used as input to the next convolutional or pooling layer.
Will TensorFlow be able to automatically calculate the gradient for this combination of tf.split, tf.multiply, and tf.reduce_sum?
EDIT: here's the series of ops, where conv is the output of tf.layers.conv2d and mask is a binary NumPy array of the same shape as conv (full code is here):
conv_masked = mask * conv  # element-wise masking
conv_grouped = tf.reshape(conv_masked, (batch_size, num_groups, fs*fs, dim, dim))
out = tf.reduce_sum(conv_grouped, axis=2)  # collapse each group into a single map
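To check this end to end, here is a minimal sketch of the same chain of ops with the gradient computed explicitly. The sizes are made up for illustration, a random tensor stands in for the tf.layers.conv2d output, and TF2's eager GradientTape is used for brevity (in TF 1.x, tf.gradients on the graph gives the same result). Since the loss is the plain sum of the masked values, the gradient with respect to conv should be exactly the mask:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only so the reshape below is valid.
batch_size, num_groups, fs, dim = 2, 4, 3, 5
conv_shape = (batch_size, num_groups * fs * fs, dim, dim)

conv = tf.Variable(tf.random.normal(conv_shape))  # stand-in for the conv2d output
mask = tf.constant(np.random.randint(0, 2, conv_shape).astype(np.float32))

with tf.GradientTape() as tape:
    conv_masked = mask * conv
    conv_grouped = tf.reshape(conv_masked, (batch_size, num_groups, fs * fs, dim, dim))
    out = tf.reduce_sum(conv_grouped, axis=2)
    loss = tf.reduce_sum(out)  # toy scalar loss for the gradient check

grad = tape.gradient(loss, conv)
# d/d(conv) of sum(mask * conv) is the mask itself, so grad should equal mask.
```

The fact that grad comes back non-None (and equals the mask here) confirms that TensorFlow differentiates through the multiply, reshape, and reduce_sum without any extra work.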
All TensorFlow operations already have their gradient formulas implemented. As long as every step of your computation is a TensorFlow op, you are fine.
Also, as you can see here, TensorFlow overloads the basic Python operators.
masked_tensor = tensor * mask
masked_tensor = tf.multiply(tensor, mask)
If the elements involved are tensors then the two expressions above are equivalent.
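A quick sketch of that equivalence, using small made-up constants:

```python
import tensorflow as tf

tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
mask = tf.constant([[1.0, 0.0], [0.0, 1.0]])

a = tensor * mask              # overloaded operator
b = tf.multiply(tensor, mask)  # explicit op
# Both give [[1., 0.], [0., 4.]] -- the same element-wise product.
```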
As for the type used for the mask:
mask = tf.constant(array)
mask = np.array(array)
For me, using Python 3.6.3 and TensorFlow 1.3.0, both produced the same result. But I found nothing in the documentation that explicitly says NumPy arrays are always accepted, so I would avoid relying on that.
One thing to note, though: the mask you are multiplying by should be a constant or a non-trainable variable. Otherwise the optimizer will alter your mask.
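Concretely, both of these keep the mask out of the optimizer's reach (the array values here are just placeholders):

```python
import numpy as np
import tensorflow as tf

array = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)

# Option 1: a constant is never updated by an optimizer.
mask_const = tf.constant(array)

# Option 2: if a Variable is needed, mark it non-trainable so
# optimizers (which collect trainable variables) skip it.
mask_var = tf.Variable(array, trainable=False)
```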