
Pytorch matrix multiplication error (shape mismatch)


I have implemented a classifier in PyTorch which looks like this:

class GenderClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3,32,(3,3)),
            nn.ReLU(),
            nn.Conv2d(32,64,(3,3)),
            nn.ReLU(),
            nn.Conv2d(64,64,(3,3)),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64*104*74,2),
            nn.Sigmoid()
        )
        
    def forward(self,x):
        return self.model(x)

I'm getting this error while training it: RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x7696 and 492544x2). The input is a 110x80 image with 3 channels, i.e. shape (3, 110, 80).
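For reference, each 3x3 convolution without padding trims 2 pixels off each spatial dimension, so a (3, 110, 80) input should leave the conv stack as (64, 104, 74), i.e. 64 * 104 * 74 = 492544 flattened features, which matches the 492544x2 weight matrix in the error. A quick way to confirm this is to push a dummy tensor through the convolutional part alone (a sketch, not part of the original code):

import torch
import torch.nn as nn

# Dummy input with the expected shape: batch of 1, 3 channels, 110x80 pixels
x = torch.randn(1, 3, 110, 80)

conv_stack = nn.Sequential(
    nn.Conv2d(3, 32, (3, 3)), nn.ReLU(),
    nn.Conv2d(32, 64, (3, 3)), nn.ReLU(),
    nn.Conv2d(64, 64, (3, 3)), nn.ReLU(),
)

out = conv_stack(x)
print(out.shape)              # torch.Size([1, 64, 104, 74])
print(out.flatten(1).shape)   # torch.Size([1, 492544])  = 64 * 104 * 74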

The training code looks like

for epoch in range(30): # train for 30 epochs
    for batch in dataset: 
        
        X,y = batch 
        X, y = X.to('cuda'), y.to('cuda')
        yhat = clf(X) 
        loss = loss_fn(yhat, y) 

        # Apply backprop 
        opt.zero_grad()
        loss.backward() 
        opt.step() 

    print(f"Epoch:{epoch} loss is {loss.item()}")
with open("model.pt","wb") as f:
    save(clf.state_dict(), f)

Solution

  • Two plausible solutions:

    1. The error message tells you how the input to the linear layer was computed. It says 64x7696, i.e. the layer received what it interprets as a batch of 64 items, each of size 7696. That means your data is not reaching the model as batches of shape (batch_size, 3, 110, 80). In fact 7696 = 104 * 74, exactly the spatial size after the three convolutions, which is what you would get if a single unbatched image of shape (3, 110, 80) is passed in: nn.Flatten keeps dim 0, so the 64 channels end up playing the role of the batch.

    To debug, print the shape of X right before passing it to the model:

    print(X.shape)
    

    Check that the batch size is what you expect, and verify that any image augmentation or preprocessing is not changing the input shape; see the sketch below.
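    For example, if the images are loaded with torchvision (an assumption here; FakeData just stands in for the real dataset), pinning the size with a Resize transform and batching through a DataLoader guarantees the model receives (batch_size, 3, 110, 80):

    from torchvision import datasets, transforms
    from torch.utils.data import DataLoader

    # Hypothetical preprocessing: force every image to 3 x 110 x 80
    preprocess = transforms.Compose([
        transforms.Resize((110, 80)),   # (height, width)
        transforms.ToTensor(),          # float tensor of shape (3, 110, 80)
    ])

    # FakeData stands in for the real dataset, just to demonstrate the shapes
    train_data = datasets.FakeData(size=256, image_size=(3, 200, 150),
                                   num_classes=2, transform=preprocess)

    # The DataLoader adds the batch dimension: each batch is (batch_size, 3, 110, 80)
    loader = DataLoader(train_data, batch_size=64, shuffle=True)
    X, y = next(iter(loader))
    print(X.shape)                      # torch.Size([64, 3, 110, 80])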

    2. The other potential cause is a mismatch between the output shape of the last convolutional layer and the input shape of the fully connected layer. The error message is about multiplying a matrix of size 64x7696 by another of size 492544x2, which isn't valid. The main change below is in the linear layers: the first linear layer now maps the flattened features to 128 units (you can adjust this size based on your experiments), followed by a ReLU and a second linear layer that outputs the 2 class scores. The last linear layer has no activation function, on the assumption that the sigmoid is applied inside the loss function (e.g. with nn.BCEWithLogitsLoss).

    class GenderClassifier(nn.Module):
        def __init__(self):
            super().__init__()
            self.model = nn.Sequential(
                nn.Conv2d(3, 32, (3, 3)),
                nn.ReLU(),
                nn.Conv2d(32, 64, (3, 3)),
                nn.ReLU(),
                nn.Conv2d(64, 64, (3, 3)),
                nn.ReLU(),
                nn.Flatten(),
                nn.Linear(64 * 104 * 74, 128),
                nn.ReLU(),
                nn.Linear(128, 2),
            )

        def forward(self, x):
            return self.model(x)
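
As a quick sanity check (a sketch; the CrossEntropyLoss here is an assumption, since the original loss_fn isn't shown), the revised model should map a (batch_size, 3, 110, 80) batch to (batch_size, 2) logits. If you prefer to keep the sigmoid formulation, nn.BCEWithLogitsLoss with one-hot float targets works instead.

import torch
import torch.nn as nn

clf = GenderClassifier()            # move to 'cuda' as in the training loop if available
loss_fn = nn.CrossEntropyLoss()     # applies log-softmax internally, so raw logits are fine

X = torch.randn(64, 3, 110, 80)     # dummy batch with the expected input shape
y = torch.randint(0, 2, (64,))      # dummy class labels: 0 or 1

yhat = clf(X)
print(yhat.shape)                   # torch.Size([64, 2])
print(loss_fn(yhat, y).item())      # scalar loss, ready for backprop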