I am running a model whose input has dimensions 1x24x24x8 (1 is the input channel). The input goes through a convolution block coded as follows:
from collections import OrderedDict
import torch.nn as nn

# 'name' is a string prefix defined elsewhere in the model
nn.Sequential(OrderedDict([
    (name + 'conv1', nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1, bias=False)),
    (name + 'bnorm1', nn.BatchNorm3d(num_features=8)),
    (name + 'relu1', nn.ReLU(inplace=True)),
    (name + 'conv2', nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3, padding=1, bias=False)),
    (name + 'bnorm2', nn.BatchNorm3d(num_features=8)),
    (name + 'relu2', nn.ReLU(inplace=True))]))
So after 'conv1', the input's dimensions change to 8x24x24x8 (8 is the output channel); kernel_size is 3.
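A quick way to confirm this is to push a dummy tensor through the first part of the block (a minimal sketch; the batch size of 1 and the prefix name = 'block1_' are assumptions for illustration only):

import torch
import torch.nn as nn
from collections import OrderedDict

name = 'block1_'  # hypothetical prefix, only for this sketch
block = nn.Sequential(OrderedDict([
    (name + 'conv1', nn.Conv3d(1, 8, kernel_size=3, padding=1, bias=False)),
    (name + 'bnorm1', nn.BatchNorm3d(8)),
    (name + 'relu1', nn.ReLU(inplace=True))]))

x = torch.randn(1, 1, 24, 24, 8)  # (batch, channels, D, H, W)
print(block(x).shape)             # torch.Size([1, 8, 24, 24, 8])
# With kernel_size=3 and padding=1, each spatial dim is preserved:
# out = (in + 2*1 - 3) // 1 + 1 = in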
What I understand is this:
We have 8 filters, each with dimensions 1x3x3x8, or equivalently each kernel has dimensions 3x3x8. Here I am almost sure about the filter dimension, because the kernel cube will cover the image cube and we will repeat it for 8 layers.
But what if, after max pooling, it enters another convolution layer? Then the input would have size 8x12x12x4 (8 is the input channel):
nn.Sequential(OrderedDict([
    (name + 'conv1', nn.Conv3d(in_channels=8, out_channels=8 * 2, kernel_size=3, padding=1, bias=False)),
    (name + 'bnorm1', nn.BatchNorm3d(num_features=8 * 2)),
    (name + 'relu1', nn.ReLU(inplace=True)),
    (name + 'conv2', nn.Conv3d(in_channels=8 * 2, out_channels=8 * 2, kernel_size=3, padding=1, bias=False)),
    (name + 'bnorm2', nn.BatchNorm3d(num_features=8 * 2)),
    (name + 'relu2', nn.ReLU(inplace=True))]))
So after 'conv1', the input's dimensions change to 16x12x12x4 (16 is the output channel); kernel_size is again 3.
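Again, a dummy forward pass confirms the shapes (a sketch; nn.MaxPool3d(2) is assumed for the pooling layer, which halves every spatial dimension, so 24x24x8 becomes 12x12x4):

import torch
import torch.nn as nn

pool = nn.MaxPool3d(2)  # assumed pooling; halves D, H and W
conv1 = nn.Conv3d(8, 16, kernel_size=3, padding=1, bias=False)

x = torch.randn(1, 8, 24, 24, 8)  # output shape of the first block
x = pool(x)
print(x.shape)         # torch.Size([1, 8, 12, 12, 4])
print(conv1(x).shape)  # torch.Size([1, 16, 12, 12, 4])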
Here, what is the number of filters? Is it correct to say that we have 16 filters? What would be the dimensions of each filter/kernel?
If we say that each filter has dimensions 8x3x3x4, then we should have 2 filters to reach the final dimension of 16x3x3x8, is that right?
If we say each filter again has dimensions 1x3x3x4 and we have 16 of them, then how does it move over the 8x12x12x4 cube to generate 16x12x12x4?
Just look at the shape of the layer weights:

import torch.nn as nn

layer = nn.Conv3d(8, 16, kernel_size=3)
print(layer.weight.shape)
# torch.Size([16, 8, 3, 3, 3])
# i.e., 16 filters, each of size (8, 3, 3, 3)
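Each of those 16 filters spans all 8 input channels at once and produces exactly one output channel, which is how an 8x12x12x4 input becomes 16x12x12x4. A quick sanity check of this (a sketch, using padding=1 and bias=False to match the question's layers):

import torch
import torch.nn as nn
import torch.nn.functional as F

layer = nn.Conv3d(8, 16, kernel_size=3, padding=1, bias=False)
x = torch.randn(1, 8, 12, 12, 4)
out = layer(x)
print(out.shape)  # torch.Size([1, 16, 12, 12, 4])

# Output channel 0 is produced by filter 0 alone: convolving x with just
# that one (8, 3, 3, 3) filter reproduces it exactly.
single = F.conv3d(x, layer.weight[:1], padding=1)
print(torch.allclose(out[:, :1], single))  # True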