
What output_padding does in nn.ConvTranspose2d?


How does output_padding work in ConvTranspose2d? Please help me understand this.

nn.ConvTranspose2d(1024, 512, kernel_size=3, stride=2, padding=1, output_padding=1)

Solution

  • According to the documentation (https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html), a Conv2d operation with stride > 1 can map inputs of different sizes to the same output size. For example, with kernel_size=3 and stride=2, both 7x7 and 8x8 inputs produce a 3x3 output:

    import torch
    
    conv_inp1 = torch.rand(1,1,7,7)
    conv_inp2 = torch.rand(1,1,8,8)
    
    conv1 = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2)
    
    out1 = conv1(conv_inp1)     
    out2 = conv1(conv_inp2)
    print(out1.shape)         # torch.Size([1, 1, 3, 3])
    print(out2.shape)         # torch.Size([1, 1, 3, 3])
    
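    The collapse of 7x7 and 8x8 to the same 3x3 output follows from the standard Conv2d output-size formula, which uses a floor (integer) division. A minimal sketch (the helper name conv_out is mine, not PyTorch's):

    ```python
    # Conv2d output size (dilation=1): floor((H_in + 2*padding - kernel_size) / stride) + 1
    def conv_out(h_in, kernel_size=3, stride=2, padding=0):
        return (h_in + 2 * padding - kernel_size) // stride + 1

    # Both 7 and 8 give 3 because the floor discards the remainder for the 8x8 case.
    print(conv_out(7))  # 3
    print(conv_out(8))  # 3
    ```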

    When applying the transpose convolution, it is therefore ambiguous which output shape to return: 7x7 or 8x8 for a stride=2 transpose convolution. The output_padding parameter resolves this ambiguity for PyTorch. Note that it doesn't actually pad the output with zeros; it is just a way to pin down the intended output shape so the transpose convolution can be computed accordingly.

    conv_t1 = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2)
    conv_t2 = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, output_padding=1)
    transposed1 = conv_t1(out1)
    transposed2 = conv_t2(out2)
    
    print(transposed1.shape)      # torch.Size([1, 1, 7, 7])
    print(transposed2.shape)      # torch.Size([1, 1, 8, 8])
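    These shapes can be verified against the ConvTranspose2d output-size formula from the docs, where output_padding enters as a plain additive term. A quick sketch (the helper name convt_out is mine):

    ```python
    import torch

    # ConvTranspose2d output size (dilation=1), per the PyTorch docs:
    # H_out = (H_in - 1)*stride - 2*padding + (kernel_size - 1) + output_padding + 1
    def convt_out(h_in, kernel_size=3, stride=2, padding=0, output_padding=0):
        return (h_in - 1) * stride - 2 * padding + (kernel_size - 1) + output_padding + 1

    print(convt_out(3))                    # 7 with output_padding=0
    print(convt_out(3, output_padding=1))  # 8 with output_padding=1

    # Cross-check against the actual module
    x = torch.rand(1, 1, 3, 3)
    conv_t = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, output_padding=1)
    assert conv_t(x).shape[-1] == convt_out(3, output_padding=1)
    ```

    Note that output_padding must be smaller than stride (or dilation), which is exactly why it can distinguish the stride-many input sizes that a strided Conv2d collapses together.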