deep-learning, pytorch

What does out_channels in Conv2d represent?


import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()

I am looking at the PyTorch Blitz tutorial. In the conv1 layer, in_channels=3 because the layer takes the input image, which has just its 3 RGB channels, and out_channels=6.

Does that mean I have 6 filters? In that case, would the total number of feature maps be 6*3 == 18? But if so, why does conv2 use in_channels=6? Shouldn't it be 18, since that was the output of the previous convolutional layer?


Solution

  • No, the number of feature maps you get is exactly out_channels. Think of it this way: you start with 3 representations of your data (your input channels), and you choose how many representations you want after the convolution; in this case you specified 6. Each of the 6 filters spans all 3 input channels and sums over them, so it produces one feature map, not three. The subsequent layer therefore has 6 input channels.

    It sounds like you don't fully grasp how convolutional neural networks work internally; take a look at an explanation like the one here.

    If you apply your case to the picture from that website, you'd have 6 different filters, leading to 6 output channels. The number you pick is up to you!
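To make the answer concrete, here is a minimal sketch that inspects the layer shapes directly. The 32x32 input size is an assumption (it matches the `16 * 5 * 5` in `fc1`), and the variable names `x`, `y`, and `per_channel` are only for illustration. It shows that each filter covers every input channel, so the contributions from the 3 input channels are summed into a single feature map per filter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)

# Weight layout is (out_channels, in_channels, kernel_h, kernel_w):
# one filter per output channel, each spanning all input channels.
print(conv1.weight.shape)  # torch.Size([6, 3, 5, 5])
print(conv2.weight.shape)  # torch.Size([16, 6, 5, 5])

# Assumed input: a batch of one 32x32 RGB image.
x = torch.randn(1, 3, 32, 32)
y = conv1(x)
print(y.shape)  # torch.Size([1, 6, 28, 28]) -> 6 feature maps, not 18

# Rebuild output channel 0 by hand: convolve each input channel with the
# matching slice of filter 0, sum the three results, then add the bias.
per_channel = sum(
    F.conv2d(x[:, c:c + 1], conv1.weight[0:1, c:c + 1])
    for c in range(3)
) + conv1.bias[0]

# Compare against the layer's own output for channel 0.
print(torch.allclose(per_channel, y[:, 0:1], atol=1e-4))
```

The sum over input channels is why conv2 takes in_channels=6: the 3 per-channel results of each filter never exist as separate feature maps.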