I would like to perform spatial pyramid pooling in TensorFlow. This has already been answered elsewhere (including in other Stack Overflow questions), but the proposed solutions do not work when the input shape is unknown.
Is there an implementation that handles shapes that are unknown at graph definition time?
To address this issue, I came up with a different implementation that uses a mask, rescaled using nearest neighbor:
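def avg_spp(self, input, scale, name, padding=DEFAULT_PADDING):
    # One-hot mask with one channel per bin: (batch, scale, scale, scale*scale)
    eye = tf.eye(scale*scale, batch_shape=(tf.shape(input)[0],))
    mask = tf.reshape(eye, (-1, scale, scale, scale*scale))
    # Upsample the mask to the (possibly unknown) spatial size of the input
    mask = tf.image.resize_nearest_neighbor(mask, tf.shape(input)[1:3])
    # Broadcast to (batch, height, width, channels, scale*scale)
    spp = tf.multiply(tf.expand_dims(input, 4), tf.expand_dims(mask, 3))
    # Pixel count per bin, taken from the mask so that zero activations are still counted
    count = tf.expand_dims(tf.reduce_sum(mask, axis=[1,2]), 1)
    spp = tf.divide(tf.reduce_sum(spp, axis=[1,2]), count)
    # (batch, channels, scale*scale) -> (batch, scale, scale, channels)
    spp = tf.reshape(spp, (-1, tf.shape(input)[3], scale, scale))
    spp = tf.transpose(spp, [0,2,3,1], name=name)
    return spp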
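
To sanity-check the masked version on small inputs, a plain-Python reference for a single channel may help. This is only a sketch: `avg_spp_reference` and `fmap` are hypothetical names, and the bin assignment `y * scale // height` is an assumption about how nearest-neighbor upsampling (without corner alignment) maps each pixel back to the `scale x scale` mask.

```python
def avg_spp_reference(feature_map, scale):
    """Average a 2-D feature map (list of rows) into a scale x scale grid.

    Assumption: pixel (y, x) falls in bin (y * scale // height, x * scale // width),
    which is how a nearest-neighbor-upsampled scale x scale mask is expected
    to partition the map.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    sums = [[0.0] * scale for _ in range(scale)]
    counts = [[0] * scale for _ in range(scale)]
    for y, row in enumerate(feature_map):
        bin_y = y * scale // height
        for x, value in enumerate(row):
            bin_x = x * scale // width
            sums[bin_y][bin_x] += value
            counts[bin_y][bin_x] += 1
    # A bin is empty only if the map is smaller than the grid; report 0.0 there.
    return [[s / c if c else 0.0 for s, c in zip(sum_row, count_row)]
            for sum_row, count_row in zip(sums, counts)]


# Hypothetical 4x4 single-channel input pooled into a 2x2 grid.
fmap = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [5.0, 5.0, 1.0, 3.0],
    [5.0, 5.0, 0.0, 0.0],
]
pooled = avg_spp_reference(fmap, 2)
print(pooled)  # [[2.5, 0.0], [5.0, 1.0]]
```

Note that zero-valued activations still count toward each bin's size here: the bottom-right bin averages to 1.0 over four pixels, not 2.0 over the two nonzero ones.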