I would like to know the difference between the convolution layer in Torch (i.e. `nn.SpatialConvolution`) and the convolution layer in PyTorch (i.e. `torch.nn.Conv2d`).
In Torch's docs, I found the output shape of `SpatialConvolution`. It says:

> If the input image is a 3D tensor `nInputPlane x height x width`, the output image size will be `nOutputPlane x oheight x owidth` where
>
>     owidth  = floor((width  + 2*padW - kW) / dW + 1)
>     oheight = floor((height + 2*padH - kH) / dH + 1)

which is different from `torch.nn.Conv2d`'s formula in the PyTorch docs.
Does this mean they are different operations?
Yes, they are different, as Torch's `nn.SpatialConvolution` does not have a `dilation` parameter (for an explanation of dilation see here; basically, the kernel has "spaces" between each kernel element, width- and height-wise, and it is this dilated kernel that slides over the image).

Except for `dilation`, both equations are the same: set `dilation` to `1` in PyTorch's version and they are equal.
If you want to use dilation in Torch, there is a separate class for that, called `nn.SpatialDilatedConvolution`.