I'm applying YOLOv5 to KITTI raw images with shape [C, H, W] = [3, 375, 1242], so I need to pad each image until H and W are divisible by 32. I'm using nn.ReplicationPad2d to do the padding: [3, 375, 1242] -> [3, 384, 1248].
The official documentation for nn.ReplicationPad2d says to pass a 4-tuple giving the padding sizes for left, right, top and bottom.
The problem is:
When I give a 4-tuple (0, pad1, 0, pad2), it raises an error: 3D tensors expect 2 values for padding
When I give a 2-tuple (pad1, pad2), the padding runs, but only W is padded, by pad1 + pad2, while H stays unchanged: I get a tensor of size [3, 375, 1257].
1257 - 1242 = 15 = 9 + 6, where the 9 was supposed to pad H and the 6 to pad W.
I can't figure out what the problem is here.
Thanks in advance.
Here is my code:
def paddingImage(img, divider=32):
    if img.shape[1]%divider != 0 or img.shape[2]%divider != 0:
        padding1_mult = int(img.shape[1] / divider) + 1
        padding2_mult = int(img.shape[2] / divider) + 1
        pad1 = (divider * padding1_mult) - img.shape[1]
        pad2 = (divider * padding2_mult) - img.shape[2]
        # pad1 = 32 - (img.shape[1]%32)
        # pad2 = 32 - (img.shape[2]%32)
        # pad1 = 384 - 375    # 9
        # pad2 = 1248 - 1242  # 6
        #################### PROBLEM ####################
        padding = nn.ReplicationPad2d((pad1, pad2))
        #################### PROBLEM ####################
        return padding(img)
    else:
        return img
Where img was given as a torch.Tensor in the main function:
# ...
image_tensor = torch.from_numpy(image_np).type(torch.float32)
image_tensor = paddingImage(image_tensor)
image_np = image_tensor.numpy()
# ...
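PyTorch (at least in the version used here) expects the input to ReplicationPad2d to be a batched image tensor of shape [N, C, H, W]. A 3D tensor appears to be treated as a batch of 1D signals, which would explain why only 2 padding values are accepted and why both of them land on the last dimension, W. Therefore, we can unsqueeze to add a 'batch dimension', pad, and then squeeze it away again. Also note that the tuple order is (left, right, top, bottom), so the W padding (pad2) goes in the second slot and the H padding (pad1) in the fourth: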
import torch
import torch.nn as nn

def paddingImage(img, divider=32):
    if img.shape[1]%divider != 0 or img.shape[2]%divider != 0:
        # Round each size up to the next multiple of divider (ceiling division),
        # so a side that is already a multiple gets zero padding
        padding1_mult = -(-img.shape[1] // divider)
        padding2_mult = -(-img.shape[2] // divider)
        pad1 = (divider * padding1_mult) - img.shape[1]
        pad2 = (divider * padding2_mult) - img.shape[2]
        # pad1 = 384 - 375    # 9
        # pad2 = 1248 - 1242  # 6
        # Order is (left, right, top, bottom): pad2 extends W, pad1 extends H
        padding = nn.ReplicationPad2d((0, pad2, 0, pad1))
        # Add an extra batch dimension, pad, and then remove the batch dimension
        return torch.squeeze(padding(torch.unsqueeze(img,0)),0)
    else:
        return img
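The pad amounts themselves can be sanity-checked with plain integer arithmetic before any tensor is involved. This is only a minimal sketch: `pad_amount` is a hypothetical helper name, not part of PyTorch, and the sizes are the ones from the question.

```python
def pad_amount(size, divider=32):
    """Hypothetical helper: padding needed to reach the next multiple of divider."""
    # Python's % returns a non-negative result for a positive divider,
    # so this is 0 when size is already a multiple.
    return (-size) % divider

h, w = 375, 1242  # KITTI raw image size from the question
pad_h, pad_w = pad_amount(h), pad_amount(w)
print(pad_h, pad_w)          # 9 6
print(h + pad_h, w + pad_w)  # 384 1248
print(pad_amount(384))       # 0
```

By contrast, a computation of the form int(size / divider) + 1 rounds up even when size is already a multiple, so it would add a full extra 32 to that side. That matters when only one of H and W needs padding.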
Hope this helps!
EDIT: As GoodDeeds mentions, this is resolved in later versions of PyTorch. Either upgrade PyTorch or, if that's not an option, use the code above.
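If upgrading is not an option, the same batch-dimension workaround can also be written with torch.nn.functional.pad. The following is a rough sketch rather than a definitive implementation: `pad_to_multiple` is a hypothetical name, and it assumes the image is a float [C, H, W] tensor that should be padded only on the right and bottom.

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(img, divider=32):
    """Hypothetical helper: replicate-pad a [C, H, W] tensor on the right and bottom."""
    pad_h = (-img.shape[1]) % divider  # 0 if H is already a multiple
    pad_w = (-img.shape[2]) % divider  # 0 if W is already a multiple
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    # The temporary batch dimension keeps this working on versions that
    # require a 4D input for 2D replication padding.
    padded = F.pad(img.unsqueeze(0), (0, pad_w, 0, pad_h), mode="replicate")
    return padded.squeeze(0)

img = torch.rand(3, 375, 1242)  # random stand-in for a KITTI image
out = pad_to_multiple(img)
print(out.shape)  # torch.Size([3, 384, 1248])
```

The original pixels stay in the top-left corner, and the added rows and columns repeat the last row and column of the image.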