Tags: python, pytorch, conv-neural-network

Upsampling the Spatial Dimensions of a 4D Tensor Using Transposed Convolution


I have a tensor of size (16, 64, 4, 4), and I want to upsample its spatial size using a transposed convolution. How can I select the kernel size, stride, and padding to get a tensor of size (16, 64, 4, 6)?

For example, this is my code for upsampling from (16, 64, 1, 1) to (16, 64, 4, 4):

nn.ConvTranspose2d(64, 64, kernel_size=6, stride=4, padding=1)

Solution

  • The equation for computing the output size is given in the PyTorch documentation for ConvTranspose2d. Plug in your input size, set your desired output size, and solve the equation for kernel size, stride, padding, etc. There may be multiple valid solutions.
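
    For reference, the per-dimension formula from that documentation is:

    out = (in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1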

    For your specific problem, the following works:

    import torch
    from torch import nn
    
    t = torch.ones((16, 64, 4, 4))  # test data
    layer = nn.ConvTranspose2d(64, 64, kernel_size=(1, 3))
    print(layer(t).shape)  # torch.Size([16, 64, 4, 6])
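
    Checking this against the formula (with the defaults stride=1, padding=0, dilation=1, output_padding=0): the height stays (4 - 1) * 1 - 0 + (1 - 1) + 1 = 4, and the width becomes (4 - 1) * 1 - 0 + (3 - 1) + 1 = 6.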
    

    To demonstrate that there can be multiple valid solutions, here is another layer that upsamples a tensor of shape (16, 64, 1, 1) to (16, 64, 4, 4), the same task as in the question:

    import torch
    from torch import nn
    
    t = torch.ones((16, 64, 1, 1))  # test data
    layer = nn.ConvTranspose2d(64, 64, kernel_size=(4, 4))
    print(layer(t).shape)  # torch.Size([16, 64, 4, 4])
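
    As a quick sanity check (a small sketch reusing only the two layers that already appear in this post), both this layer and the kernel_size=6, stride=4, padding=1 layer from the question map a (16, 64, 1, 1) input to (16, 64, 4, 4): with an input size of 1, the (in - 1) * stride term vanishes, so only kernel size, padding, and output padding determine the output size.

    import torch
    from torch import nn
    
    t = torch.ones((16, 64, 1, 1))  # test data
    a = nn.ConvTranspose2d(64, 64, kernel_size=6, stride=4, padding=1)  # from the question
    b = nn.ConvTranspose2d(64, 64, kernel_size=(4, 4))  # from this answer
    print(a(t).shape)  # torch.Size([16, 64, 4, 4]): (1 - 1) * 4 - 2 * 1 + (6 - 1) + 1 = 4
    print(b(t).shape)  # torch.Size([16, 64, 4, 4]): (1 - 1) * 1 - 2 * 0 + (4 - 1) + 1 = 4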