Tags: python, pytorch, padding, torch

Why couldn't I feed a 4-tuple to nn.ReplicationPad2d()?


I'm applying yolov5 to KITTI raw images of shape [C, H, W] = [3, 375, 1242]. I therefore need to pad each image so that H and W are divisible by 32, and I'm using nn.ReplicationPad2d to do the padding: [3, 375, 1242] -> [3, 384, 1248].
The official documentation of nn.ReplicationPad2d says to pass a 4-tuple indicating the padding sizes for left, right, top and bottom.
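For reference, here is how I understand the documented 4-tuple usage on a batched input (the shapes are from my image; the (left, right, top, bottom) order is what the docs describe):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 375, 1242)         # batched [N, C, H, W] image
pad = nn.ReplicationPad2d((0, 6, 0, 9))  # (left, right, top, bottom): +6 on W, +9 on H
print(pad(x).shape)                      # torch.Size([1, 3, 384, 1248])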
The Problem is:
When I give a 4-tuple (0, pad1, 0, pad2), it raises the error: 3D tensors expect 2 values for padding.
When I give a 2-tuple (pad1, pad2), the padding runs, but it seems that only W is padded by pad1+pad2 while H stays unchanged, because I get a tensor of size [3, 375, 1257].
1257 - 1242 = 15 = 9 + 6, where 9 was supposed to pad H and 6 to pad W.
I could not figure out what the problem is here...
Thanks in advance.

Here is my code:

def paddingImage(img, divider=32):
    if img.shape[1]%divider != 0 or img.shape[2]%divider != 0:
        padding1_mult = int(img.shape[1] / divider) + 1
        padding2_mult = int(img.shape[2] / divider) + 1
        pad1 = (divider * padding1_mult) - img.shape[1]
        pad2 = (divider * padding2_mult) - img.shape[2]

        # pad1 = 32 - (img.shape[1]%32)
        # pad2 = 32 - (img.shape[2]%32)
        # pad1 = 384 - 375    # 9
        # pad2 = 1248 - 1242  # 6

        #################### PROBLEM ####################
        padding = nn.ReplicationPad2d((pad1, pad2))
        #################### PROBLEM ####################

        return padding(img)
    else:
        return img

Here img is passed in as a torch.Tensor from the main function:

# ...
image_tensor = torch.from_numpy(image_np).type(torch.float32)
image_tensor = paddingImage(image_tensor)
image_np = image_tensor.numpy()
# ...

Solution

  • On the PyTorch version used here, nn.ReplicationPad2d expects a batched (4D) image tensor, so a 3D [C, H, W] input does not work with a 4-tuple. We can therefore unsqueeze to add a 'batch dimension' before padding and squeeze it away afterwards.

    import torch
    import torch.nn as nn

    def paddingImage(img, divider=32):
        if img.shape[1] % divider != 0 or img.shape[2] % divider != 0:
            padding1_mult = int(img.shape[1] / divider) + 1
            padding2_mult = int(img.shape[2] / divider) + 1
            pad1 = (divider * padding1_mult) - img.shape[1]  # extra rows needed on H
            pad2 = (divider * padding2_mult) - img.shape[2]  # extra columns needed on W

            # pad1 = 32 - (img.shape[1]%32)
            # pad2 = 32 - (img.shape[2]%32)
            # pad1 = 384 - 375    # 9
            # pad2 = 1248 - 1242  # 6

            # The 4-tuple is (left, right, top, bottom): pad W by pad2 and H by pad1
            padding = nn.ReplicationPad2d((0, pad2, 0, pad1))
            # Add an extra batch dimension, pad, and then remove the batch dimension
            return torch.squeeze(padding(torch.unsqueeze(img, 0)), 0)
        else:
            return img
    

    Hope this helps!
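
    As a quick sanity check with random data in the KITTI image shape from the question (assuming the paddingImage function above is in scope):

        import torch
        img = torch.randn(3, 375, 1242)
        padded = paddingImage(img)
        print(padded.shape)   # torch.Size([3, 384, 1248])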

    EDIT: As GoodDeeds mentions, this is resolved in later versions of PyTorch. Either upgrade PyTorch or, if that's not an option, use the code above.
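
    On a sufficiently recent PyTorch, the padding layers also accept unbatched inputs, so the 4-tuple works directly on a 3D [C, H, W] tensor (a minimal sketch, assuming such a version):

        import torch
        import torch.nn as nn

        x = torch.randn(3, 375, 1242)
        y = nn.ReplicationPad2d((0, 6, 0, 9))(x)   # no batch dimension needed
        print(y.shape)                             # torch.Size([3, 384, 1248])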