Tags: machine-learning, conv-neural-network, pytorch, torch

What's the difference between the convolution layer in `Torch` (i.e. `nn.SpatialConvolution`) and the convolution layer in `PyTorch` (i.e. `torch.nn.Conv2d`)?


I would like to know the difference between the convolution layer in Torch (i.e. nn.SpatialConvolution) and the convolution layer in PyTorch (i.e. torch.nn.Conv2d).

In Torch's docs, I found the output shape formula for SpatialConvolution.

It says: "If the input image is a 3D tensor nInputPlane x height x width, the output image size will be nOutputPlane x oheight x owidth, where

owidth  = floor((width  + 2*padW - kW) / dW + 1)
oheight = floor((height + 2*padH - kH) / dH + 1)
"

which differs from the formula for torch.nn.Conv2d in the PyTorch docs.

The output shape of Conv2d in the PyTorch docs:

owidth  = floor((width  + 2*padW - dilationW*(kW - 1) - 1) / dW + 1)
oheight = floor((height + 2*padH - dilationH*(kH - 1) - 1) / dH + 1)

Does that mean they are different operations?


Solution

  • Yes, they are different, because Torch's nn.SpatialConvolution has no dilation parameter (for an explanation of dilation see here; essentially, the kernel has "spaces" between its elements, width- and height-wise, and this dilated kernel is what slides over the image).

    Except for dilation, both equations are the same: set dilation to 1 in PyTorch's version and they become equal.

    If you want to use dilation in Torch, there is a separate class for it: nn.SpatialDilatedConvolution.
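A quick way to confirm this: implement both output-size formulas in plain Python (no torch installation needed) and check that they coincide whenever dilation is 1. This is a sketch of the arithmetic only, not of the convolutions themselves.

```python
import math

def torch7_out(size, k, pad, stride):
    # Torch7 SpatialConvolution output size:
    #   floor((size + 2*pad - k) / stride + 1)
    return math.floor((size + 2 * pad - k) / stride) + 1

def pytorch_out(size, k, pad, stride, dilation=1):
    # PyTorch Conv2d output size:
    #   floor((size + 2*pad - dilation*(k - 1) - 1) / stride + 1)
    return math.floor((size + 2 * pad - dilation * (k - 1) - 1) / stride) + 1

# With dilation=1 the numerators are identical:
#   size + 2*pad - 1*(k - 1) - 1 == size + 2*pad - k
for size in (7, 28, 224):
    for k in (1, 3, 5):
        for pad in (0, 1, 2):
            for stride in (1, 2):
                assert torch7_out(size, k, pad, stride) == \
                       pytorch_out(size, k, pad, stride, dilation=1)
```

With dilation > 1 the sizes diverge: for a 28-pixel input, 3x3 kernel, padding 1, stride 1, both formulas give 28 at dilation=1, but PyTorch's formula gives 26 at dilation=2.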