
What output_padding does in nn.ConvTranspose2d?


How does output_padding work in ConvTranspose2d? Please help me understand this.

nn.ConvTranspose2d(1024, 512, kernel_size=3, stride=2, padding=1, output_padding=1)

Solution

  • According to the documentation (https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html), a Conv2d operation with stride > 1 can map inputs of different sizes to the same output size. For example, with kernel_size=3 and stride=2, both 7x7 and 8x8 inputs produce a 3x3 output:

    import torch
    
    conv_inp1 = torch.rand(1,1,7,7)
    conv_inp2 = torch.rand(1,1,8,8)
    
    conv1 = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2)
    
    out1 = conv1(conv_inp1)     
    out2 = conv1(conv_inp2)
    print(out1.shape)         # torch.Size([1, 1, 3, 3])
    print(out2.shape)         # torch.Size([1, 1, 3, 3])
    
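    The collapse of 7x7 and 8x8 to the same 3x3 output follows from the standard Conv2d output-size formula, which uses a floor (integer) division. A minimal sketch (the helper name conv_out is mine, not PyTorch's):

    ```python
    # Conv2d output size (dilation=1): floor((H_in + 2*padding - kernel_size) / stride) + 1
    def conv_out(h_in, kernel_size=3, stride=2, padding=0):
        return (h_in + 2 * padding - kernel_size) // stride + 1

    # Both 7 and 8 give 3 because the floor discards the remainder for the 8x8 case.
    print(conv_out(7))  # 3
    print(conv_out(8))  # 3
    ```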

    When applying the transpose convolution, it is therefore ambiguous which output shape to return: 7x7 or 8x8 for a stride=2 transpose convolution. The output_padding parameter resolves this ambiguity for PyTorch. Note that it doesn't actually pad the output with zeros; it is just a way to pin down the intended output shape so the transpose convolution can be computed accordingly.

    conv_t1 = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2)
    conv_t2 = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, output_padding=1)
    transposed1 = conv_t1(out1)
    transposed2 = conv_t2(out2)
    
    print(transposed1.shape)      # torch.Size([1, 1, 7, 7])
    print(transposed2.shape)      # torch.Size([1, 1, 8, 8])
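    These shapes can be verified against the ConvTranspose2d output-size formula from the docs, where output_padding enters as a plain additive term. A quick sketch (the helper name convt_out is mine):

    ```python
    import torch

    # ConvTranspose2d output size (dilation=1), per the PyTorch docs:
    # H_out = (H_in - 1)*stride - 2*padding + (kernel_size - 1) + output_padding + 1
    def convt_out(h_in, kernel_size=3, stride=2, padding=0, output_padding=0):
        return (h_in - 1) * stride - 2 * padding + (kernel_size - 1) + output_padding + 1

    print(convt_out(3))                    # 7 with output_padding=0
    print(convt_out(3, output_padding=1))  # 8 with output_padding=1

    # Cross-check against the actual module
    x = torch.rand(1, 1, 3, 3)
    conv_t = torch.nn.ConvTranspose2d(1, 1, kernel_size=3, stride=2, output_padding=1)
    assert conv_t(x).shape[-1] == convt_out(3, output_padding=1)
    ```

    Note that output_padding must be smaller than stride (or dilation), which is exactly why it can distinguish the stride-many input sizes that a strided Conv2d collapses together.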