TensorFlow version of Torch nn.DepthConcat


Torch has a module, nn.DepthConcat, which is similar to nn.Concat except that it pads with zeros so that all of the non-channel dims are the same size. I have been trying to get this going in TensorFlow with little luck. If I know the sizes of all of the tensors at graph-building time, this seems to work:

    def depthconcat(inputs):
        concat_dim = 3
        shapes = []
        for input_ in inputs:
            shapes.append(input_.get_shape())
        shape_tensor = tf.pack(shapes)
        max_dims = tf.reduce_max(shape_tensor, 0)

        padded_inputs = []
        for input_ in inputs:
            paddings = max_dims - input_.get_shape()
            padded_inputs.append(tf.pad(input_, paddings))
        return tf.concat(concat_dim, padded_inputs)

However, if the shape is only determined at run time, I get the following error:

    Tensors in list passed to 'values' of 'Pack' Op have types [<NOT CONVERTIBLE TO TENSOR>, <NOT CONVERTIBLE TO TENSOR>, <NOT CONVERTIBLE TO TENSOR>, <NOT CONVERTIBLE TO TENSOR>] that don't all match.

It seems that a TensorShape object can only be converted into a tensor if it is fully defined at graph-building time. Any suggestions? Thanks.

EDIT: Changing from input_.get_shape() to tf.shape(input_) solved the problem of the ambiguous shape at graph creation. Now I get ValueError: Shape (4,) must have rank 2.
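
The rank error most likely comes from the paddings argument. tf.pad expects an integer tensor of shape [n, 2], one [before, after] pair per dimension, whereas a difference of two shape vectors is a rank-1 tensor of length 4. Below is a minimal plain-Python sketch of the conversion, with no TensorFlow involved. The name end_paddings and the example shapes are made up for illustration, and it pads only at the end of each dimension.

```python
def end_paddings(shape, target):
    """Return one [before, after] pair per dimension, padding only at the end."""
    return [[0, t - s] for s, t in zip(shape, target)]

shape = (1, 5, 5, 8)    # hypothetical input shape
target = (1, 8, 8, 8)   # hypothetical shape to grow to

# A plain difference is rank 1: one number per dimension.
diff = [t - s for s, t in zip(shape, target)]
print(diff)                         # [0, 3, 3, 0]

# The rank-2 layout has one row per dimension and two columns.
print(end_paddings(shape, target))  # [[0, 0], [0, 3], [0, 3], [0, 0]]
```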


Solution

  • I hope this helps anyone else trying to build an inception module with varying output sizes.

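    import tensorflow as tf

    def depthconcat(inputs, name=None):
        # Uses the pre-1.0 API from the question: tf.pack and tf.concat(dim, values).
        concat_dim = 3  # channel axis
        shapes = []
        for input_ in inputs:
            # Run-time shape of the non-channel dims, as floats so it can be halved.
            shapes.append(tf.to_float(tf.shape(input_)[:3]))
        shape_tensor = tf.pack(shapes)
        max_dims = tf.reduce_max(shape_tensor, 0)

        padded_inputs = []
        for idx, input_ in enumerate(inputs):
            # Split the size difference between the two sides of each dimension.
            mean_diff = (max_dims - shapes[idx]) / 2.0
            pad_low = tf.floor(mean_diff)
            pad_high = tf.ceil(mean_diff)
            paddings = tf.to_int32(tf.pack([pad_low, pad_high], axis=1))
            # Append a [0, 0] row so the channel dimension is left unpadded.
            paddings = tf.pad(paddings, paddings=[[0, 1], [0, 0]])
            padded_inputs.append(tf.pad(input_, paddings))

        return tf.concat(concat_dim, padded_inputs, name=name)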
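
To sanity-check the output of a function like this, it can help to compare against a small NumPy reference. The sketch below is not part of the original answer: depthconcat_reference is a hypothetical name, and it assumes 4-D arrays with channels on the last axis. It follows the same floor/ceil split of the size difference, so an odd gap puts the extra zero at the end.

```python
import numpy as np

def depthconcat_reference(arrays):
    """Zero-pad 4-D arrays to common non-channel dims, then join along channels."""
    # Largest size of each of the first three dimensions across all inputs.
    max_dims = np.max([a.shape[:3] for a in arrays], axis=0)
    padded = []
    for a in arrays:
        diff = max_dims - np.array(a.shape[:3])
        # Floor of half the gap goes before, the remainder (ceil) goes after.
        pad_width = [(int(d // 2), int(d - d // 2)) for d in diff]
        pad_width.append((0, 0))  # leave the channel dimension unpadded
        padded.append(np.pad(a, pad_width, mode="constant"))
    return np.concatenate(padded, axis=3)

a = np.ones((2, 5, 5, 8))  # hypothetical 5x5 branch with 8 channels
b = np.ones((2, 8, 8, 4))  # hypothetical 8x8 branch with 4 channels
out = depthconcat_reference([a, b])
print(out.shape)  # (2, 8, 8, 12)
```

Here the 5x5 input gets one row and column of zeros before and two after, so its original values sit at indices 1 to 5 of the padded 8x8 grid.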