I have a convolutional layer that produces 16 output feature maps, and I want to transform these maps into 4 maps like this:
Alternatively, I can first multiply all 16 maps by a mask and then split the result into 4 groups, applying reduce_sum to each group. The resulting 4 maps will be used as input to the next convolutional or pooling layer.
Will TensorFlow be able to automatically calculate the gradient for this combination of tf.split, tf.multiply, and tf.reduce_sum?
EDIT: here's the series of ops, where conv is the output of tf.layers.conv2d and mask is a binary NumPy array of the same shape as conv (full code is here):
conv_masked = mask * conv  # element-wise masking
conv_grouped = tf.reshape(conv_masked, (batch_size, num_groups, fs*fs, dim, dim))
out = tf.reduce_sum(conv_grouped, axis=2)  # collapse each group into a single map
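To check this end to end, here is a minimal sketch of the same chain of ops with the gradient computed explicitly. The sizes are made up for illustration, a random tensor stands in for the tf.layers.conv2d output, and TF2's eager GradientTape is used for brevity (in TF 1.x, tf.gradients on the graph gives the same result). Since the loss is the plain sum of the masked values, the gradient with respect to conv should be exactly the mask:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only so the reshape below is valid.
batch_size, num_groups, fs, dim = 2, 4, 3, 5
conv_shape = (batch_size, num_groups * fs * fs, dim, dim)

conv = tf.Variable(tf.random.normal(conv_shape))  # stand-in for the conv2d output
mask = tf.constant(np.random.randint(0, 2, conv_shape).astype(np.float32))

with tf.GradientTape() as tape:
    conv_masked = mask * conv
    conv_grouped = tf.reshape(conv_masked, (batch_size, num_groups, fs * fs, dim, dim))
    out = tf.reduce_sum(conv_grouped, axis=2)
    loss = tf.reduce_sum(out)  # toy scalar loss for the gradient check

grad = tape.gradient(loss, conv)
# d/d(conv) of sum(mask * conv) is the mask itself, so grad should equal mask.
```

The fact that grad comes back non-None (and equals the mask here) confirms that TensorFlow differentiates through the multiply, reshape, and reduce_sum without any extra work.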
All TensorFlow operations already have their gradient formulas implemented. As long as every step of your computation is a TensorFlow op, you are fine.
Also, as you can see here, TensorFlow overloads the basic Python operators.
masked_tensor = tensor * mask
masked_tensor = tf.multiply(tensor, mask)
If the elements involved are tensors then the two expressions above are equivalent.
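A quick sketch of that equivalence, using small made-up constants:

```python
import tensorflow as tf

tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
mask = tf.constant([[1.0, 0.0], [0.0, 1.0]])

a = tensor * mask              # overloaded operator
b = tf.multiply(tensor, mask)  # explicit op
# Both give [[1., 0.], [0., 4.]] -- the same element-wise product.
```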
As for the type used for the mask:
mask = tf.constant(array)
mask = np.array(array)
For me, using Python 3.6.3 and TensorFlow 1.3.0, both produced the same result. But I found nothing in the documentation that explicitly says NumPy arrays are always accepted, so I would avoid relying on that.
One thing to note, though: the mask you are multiplying by should be a constant or a non-trainable variable. Otherwise the optimizer will alter your mask.
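Concretely, both of these keep the mask out of the optimizer's reach (the array values here are just placeholders):

```python
import numpy as np
import tensorflow as tf

array = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)

# Option 1: a constant is never updated by an optimizer.
mask_const = tf.constant(array)

# Option 2: if a Variable is needed, mark it non-trainable so
# optimizers (which collect trainable variables) skip it.
mask_var = tf.Variable(array, trainable=False)
```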