Tags: python, tensorflow, neural-network, artificial-intelligence, pytorch

Beginner PyTorch : RuntimeError: size mismatch, m1: [16 x 2304000], m2: [600 x 120]


I'm a beginner with PyTorch and with building NNs in general, and I'm kind of stuck.

I have this CNN architecture:

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):

    def __init__(self, num_classes=10):
        super(ConvNet, self).__init__()

        self.conv1 = nn.Conv2d(
            in_channels=3, 
            out_channels=16, 
            kernel_size=3)

        self.conv2 = nn.Conv2d(
            in_channels=16, 
            out_channels=24, 
            kernel_size=4)

        self.conv3 = nn.Conv2d(
            in_channels=24, 
            out_channels=32, 
            kernel_size=4)

        self.dropout = nn.Dropout2d(p=0.3)

        self.pool = nn.MaxPool2d(2)

        self.fc1 = nn.Linear(600, 120)
        self.fc2 = nn.Linear(512, 10)

        self.final = nn.Softmax(dim=1)

    def forward(self, x):

        # three conv blocks: conv -> ReLU -> 2x2 max-pool -> dropout

        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = self.dropout(x)

        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = self.dropout(x)

        x = F.max_pool2d(F.relu(self.conv3(x)), 2)
        x = self.dropout(x)

        # resize, flatten, and apply the linear layer

        x = F.interpolate(x, size=(600, 120))  # resize feature maps to 600 x 120
        x = x.view(x.size(0), -1)              # flatten to (batch, features)

        x = self.fc1(x) 

        return x

But when I try to train it on my images, it fails with this error:

RuntimeError: size mismatch, m1: [16 x 2304000], m2: [600 x 120]

I would like to add a second linear layer (self.fc2) as well as a final Softmax layer (self.final), but since I'm stuck at the first linear layer, I can't make any progress.


Solution

  • The input dimension of self.fc1 needs to match the feature (second) dimension of your flattened tensor. So instead of self.fc1 = nn.Linear(600, 120), use self.fc1 = nn.Linear(2304000, 120). The 2304000 comes from the flatten: after F.interpolate your tensor has 32 channels and a 600 x 120 spatial size, and 32 * 600 * 120 = 2304000 (m1: [16 x 2304000] is your batch of 16 such flattened vectors).

    Keep in mind that because you are using fully-connected layers, the model cannot be input-size invariant (unlike Fully-Convolutional Networks). If you change the size of the channel or spatial dimensions before x = x.view(x.size(0), -1) (like you did moving from the last question to this one), the input dimension of self.fc1 will have to change accordingly; the sketch below shows a quick way to read the size off a dummy forward pass.
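
    To double-check the number, you can push a dummy batch through the convolutional part and print the flattened size. This is a minimal sketch; the 224 x 224 input size is an assumption, since the question doesn't state the image dimensions (because of F.interpolate, the flattened size comes out to 2304000 for any input large enough to survive the conv/pool stack). Dropout is omitted because it doesn't change the shape:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        # same conv stack as in the question
        conv1 = nn.Conv2d(3, 16, kernel_size=3)
        conv2 = nn.Conv2d(16, 24, kernel_size=4)
        conv3 = nn.Conv2d(24, 32, kernel_size=4)

        x = torch.randn(1, 3, 224, 224)  # dummy batch; 224 x 224 is an assumed size
        x = F.max_pool2d(F.relu(conv1(x)), 2)
        x = F.max_pool2d(F.relu(conv2(x)), 2)
        x = F.max_pool2d(F.relu(conv3(x)), 2)
        x = F.interpolate(x, size=(600, 120))  # forces spatial size to 600 x 120
        x = x.view(x.size(0), -1)              # flatten to (batch, features)
        print(x.shape)  # torch.Size([1, 2304000]) -> fc1 must be nn.Linear(2304000, 120)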