python pytorch padding truncate

How to pad a set of tensors to a specific height


I am doing action recognition with mediapipe keypoints. These are the shapes of some of my tensors:

torch.Size([3, 3, 75]) torch.Size([3, 6, 75]) torch.Size([3, 10, 75]) torch.Size([3, 11, 75]) torch.Size([3, 9, 75]) torch.Size([3, 4, 75]) torch.Size([3, 21, 75])

The height of each tensor varies because it corresponds to the number of frames in each sample.

I have decided to use 8 frames for each sample. I understand that I have to pad the shorter tensors and truncate the ones with heights above 8, but somehow just doing the padding worked, or so it seems. I would like to understand why my code works.

import torch.nn.functional as F

if height < 8:
    source_pad = F.pad(tensor1, pad=(0, 0, 0, 8 - height))
else:
    source_pad = F.pad(tensor1, pad=(0, 0, 0, 8 - height))

Solution

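  • Per the documentation, the pad arguments are specified working backwards from the last dimension, so pad=(0, 0, 0, 8 - height) pads the last dimension by 0 on each side and the second-to-last dimension by 0 at the start and 8 - height at the end. 8 - height is positive if height is less than 8, 0 if height equals 8, and negative if height is greater than 8. A negative pad amount removes elements from that side instead of adding them, which is why the taller tensors end up truncated even though the code only calls F.pad.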

    In other words, the resulting height = height + 0 + (8 - height) = 8, whatever the original height.
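To check the behaviour described above, here is a minimal sketch that runs the same F.pad call on one tensor shorter than 8 frames and one longer. The helper name fit_to_frames and the sample tensors are made up for illustration, and it assumes the default constant (zero) padding mode, as in the question's code.

```python
import torch
import torch.nn.functional as F

TARGET_FRAMES = 8  # target height from the question


def fit_to_frames(tensor, target=TARGET_FRAMES):
    """Hypothetical helper: zero-pad or crop the second-to-last dimension to `target`."""
    height = tensor.shape[-2]
    # Pad pairs run backwards from the last dimension:
    # (last_start, last_end, second_to_last_start, second_to_last_end).
    # A negative amount crops from that side instead of padding.
    return F.pad(tensor, pad=(0, 0, 0, target - height))


# Illustrative inputs with the shapes from the question
short = torch.ones(3, 3, 75)  # fewer than 8 frames
long = torch.arange(3 * 21 * 75, dtype=torch.float32).reshape(3, 21, 75)  # more than 8 frames

short_fixed = fit_to_frames(short)
long_fixed = fit_to_frames(long)

print(short_fixed.shape)  # torch.Size([3, 8, 75])
print(long_fixed.shape)   # torch.Size([3, 8, 75])
```

In this sketch the short tensor gains 5 rows of zeros at the end, while the long tensor keeps only its first 8 frames, equivalent to long[:, :8, :]. If you would rather make the truncation explicit, slicing before padding should give the same result.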