Tags: python, pytorch, conv-neural-network

ERROR in CNN Pytorch; shape '[-1, 192]' is invalid for input of size 300000


I want to change the kernel size to 3 and the output channels of the convolutional layers to 8 and 16 respectively. The following code works fine, but when I change the kernel size and output channels like this:

    self.conv1 = nn.Conv2d(in_channels=1,out_channels=8,kernel_size=3)
    self.conv2 = nn.Conv2d(in_channels=8,out_channels=16,kernel_size=3)
    self.fc1 = nn.Linear(in_features=16*2*2,out_features=128)

it generates the invalid-input-size error shown in the title.

Working code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import transforms
from torchvision.datasets import FashionMNIST

class Network(nn.Module):
  def __init__(self):
    super(Network,self).__init__()
    self.conv1 = nn.Conv2d(in_channels=1,out_channels=6,kernel_size=5)
    self.conv2 = nn.Conv2d(in_channels=6,out_channels=12,kernel_size=5)
    self.fc1 = nn.Linear(in_features=12*4*4,out_features=128)
    self.fc2 = nn.Linear(in_features=128,out_features=64)
    self.out = nn.Linear(in_features=64,out_features=10)
  def forward(self,x):
    #input layer
    x = x
    #first hidden layer
    x = self.conv1(x)
    x = F.relu(x)
    x = F.max_pool2d(x,kernel_size=2,stride=2)
    #second hidden layer
    x = self.conv2(x)
    x = F.relu(x)
    x = F.max_pool2d(x,kernel_size=2,stride=2)
    #third hidden layer
    x = x.reshape(-1,12*4*4)  # flatten the (12, 4, 4) activations
    x = self.fc1(x)
    x = F.relu(x)
    #fourth hidden layer
    x = self.fc2(x)
    x = F.relu(x)
    
    #output layer
    x = self.out(x)
    return x


batch_size = 1000
train_dataset = FashionMNIST(
    '../data', train=True, download=True, 
    transform=transforms.ToTensor())
trainloader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

test_dataset = FashionMNIST(
    '../data', train=False, download=True, 
    transform=transforms.ToTensor())
testloader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=True)

model = Network()

losses = []
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())
epochs = 1

for i in range(epochs):
    batch_loss = []
    for j, (data, targets) in enumerate(trainloader):
        optimizer.zero_grad()
        ypred = model(data)
        loss = criterion(ypred, targets.reshape(-1))
        loss.backward()
        optimizer.step()
        batch_loss.append(loss.item())
    if i > 10:
        # assigning optimizer.lr has no effect; the learning rate lives in param_groups
        for g in optimizer.param_groups:
            g['lr'] = 0.0005
    losses.append(sum(batch_loss) / len(batch_loss))
    print('Epoch {}:\tloss {:.4f}'.format(i, losses[-1]))

Solution

  • By changing the kernel size and the number of output channels of the intermediate layers, you also change the size of your intermediate activations.

    I assume your input data is of size (1, 28, 28) (the usual size for FashionMNIST). In your original code, after two 2D convolutional layers and two max-pools, the activations entering self.fc1 have shape (12, 4, 4). If you change the kernel size to 3 and the output channels to 8 and 16, that shape becomes (16, 5, 5), so the network has to change accordingly. Try the following:

    class Network(nn.Module):
        def __init__(self):
            super(Network,self).__init__()
            self.conv1 = nn.Conv2d(in_channels=1,out_channels=8,kernel_size=3)
            self.conv2 = nn.Conv2d(in_channels=8,out_channels=16,kernel_size=3)
            self.fc1 = nn.Linear(in_features=16*5*5,out_features=128)
            self.fc2 = nn.Linear(in_features=128,out_features=64)
            self.out = nn.Linear(in_features=64,out_features=10)
         
        def forward(self,x):
            #input layer
            x = x
         
            #first hidden layer
            x = self.conv1(x)
            x = F.relu(x)
            x = F.max_pool2d(x,kernel_size=2,stride=2)
         
            #second hidden layer
            x = self.conv2(x)
            x = F.relu(x)
            x = F.max_pool2d(x,kernel_size=2,stride=2)
            
            #third hidden layer
            x = x.reshape(-1,16*5*5)  # flatten the (16, 5, 5) activations
            x = self.fc1(x)
            x = F.relu(x)
        
            #fourth hidden layer
            x = self.fc2(x)
            x = F.relu(x)
    
            #output layer
            x = self.out(x)
            return x
    

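    A quick way to avoid hard-coding the wrong flattened size is to push a dummy batch through the convolutional part of the network and inspect the resulting shape. A minimal sketch, assuming the modified Network above and 28x28 single-channel inputs:

        import torch
        import torch.nn.functional as F

        model = Network()
        dummy = torch.randn(1, 1, 28, 28)  # one fake FashionMNIST image

        # replay only the conv/pool part of forward() to inspect the shape
        x = F.max_pool2d(F.relu(model.conv1(dummy)), kernel_size=2, stride=2)
        x = F.max_pool2d(F.relu(model.conv2(x)), kernel_size=2, stride=2)
        print(x.shape)  # torch.Size([1, 16, 5, 5]) -> flatten to 16*5*5
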
    Check PyTorch's documentation for the Conv2d and MaxPool2d layers. The output size after a Conv2d layer is:

    H_out = ⌊(H_in + 2*padding[0] - dilation[0]*(kernel_size[0]-1) - 1) / stride[0] + 1⌋

    W_out = ⌊(W_in + 2*padding[1] - dilation[1]*(kernel_size[1]-1) - 1) / stride[1] + 1⌋
    

    As you use the default values (padding=0, dilation=1, stride=1) with kernel_size=3, the output size after the first convolutional layer will be:

    H_out = W_out = 28 + 0 - 2 - 1 + 1 = 26
    

    The max-pool that follows divides this size by 2 (giving 13), and after the second convolutional layer the size will be:

    13 + 0 - 2 - 1 + 1 = 11
    

    The second max-pool divides this by 2 again, taking the floor value, which is 5. Thus, the output shape after the second pooling stage will be (n, 16, 5, 5). Before the first fully connected layer, this has to be flattened, which is why the number of input features of self.fc1 is 16*5*5.
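
    The same arithmetic can be wrapped in a small helper. This is only an illustrative sketch (conv_out is a hypothetical name, not a PyTorch function); with the default padding, dilation and stride it reproduces the 26 → 13 → 11 → 5 chain above:

        def conv_out(size, kernel_size, padding=0, dilation=1, stride=1):
            """Output size along one spatial dimension for Conv2d/MaxPool2d."""
            return (size + 2*padding - dilation*(kernel_size - 1) - 1) // stride + 1

        s = 28                                    # FashionMNIST height/width
        s = conv_out(s, kernel_size=3)            # conv1    -> 26
        s = conv_out(s, kernel_size=2, stride=2)  # max-pool -> 13
        s = conv_out(s, kernel_size=3)            # conv2    -> 11
        s = conv_out(s, kernel_size=2, stride=2)  # max-pool -> 5
        print(16 * s * s)                         # 400, the in_features of fc1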