python, pytorch, conv-neural-network, batch-normalization

How to resolve the error "mat1 and mat2 shapes cannot be multiplied" for the following CNN architecture?


I am trying to implement a Conv1d model with batch normalization, but I get the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-117-ef6e122ea50c> in <module>()
----> 1 test()
      2 for epoch in range(1, n_epochs + 1):
      3   train(epoch)
      4   test()

7 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1751     if has_torch_function_variadic(input, weight):
   1752         return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1753     return torch._C._nn.linear(input, weight, bias)
   1754 
   1755 

RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x140 and 100x10)

I use a batch size of 32, and each sample has 40 features. I tried to work out where the 32x140 comes from, but could not. Here is the CNN architecture I am using:

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        #self.flatten=nn.Flatten()
        self.net_stack=nn.Sequential(
            nn.Conv1d(in_channels=1, out_channels=25, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.BatchNorm1d(25, affine=True), # batch norm over the 25 channels
            nn.Conv1d(in_channels=25, out_channels=20, kernel_size=5, stride=2),
            nn.ReLU(),
            nn.BatchNorm1d(20, affine=True), # batch norm over the 20 channels
            nn.Flatten(),
            nn.Linear(20*5, 10),
            nn.Softmax(dim=1))

    def forward(self,x):
        # result=self.net_stack(x[None])
        # add a channel dimension: (B, 40) -> (B, 1, 40)
        result=self.net_stack(x[:, None, :])
        return result

Solution

  • The fully-connected layer should change from:

    nn.Linear(20*5, 10)
    

    to:

    nn.Linear(20*7, 10)
    

    Why?

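    If your input data length is 40, then, with no padding, each Conv1d produces an output of length floor((L_in - kernel_size) / stride) + 1 (B is the batch size, K the number of output channels):

    • Output after first conv (K=25): B x 25 x 18, since floor((40 - 5) / 2) + 1 = 18
    • Output after second conv (K=20): B x 20 x 7, since floor((18 - 5) / 2) + 1 = 7
    • Output after nn.Flatten(): B x 140 (20 * 7), i.e., if B=32, then 32x140, so the Linear layer needs in_features=140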
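
The shape arithmetic above can be checked with a few lines of plain Python. This is only a sketch: `conv1d_out_len` is a hypothetical helper name (not part of PyTorch), and it assumes the output-length formula given in the PyTorch `Conv1d` documentation, with padding 0 and dilation 1 as in the model from the question.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution (hypothetical helper).

    Uses the formula from the PyTorch Conv1d docs:
    floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    """
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


input_length = 40  # number of features per sample, from the question

# First conv: kernel_size=5, stride=2
after_conv1 = conv1d_out_len(input_length, kernel_size=5, stride=2)

# Second conv: kernel_size=5, stride=2, applied to the first conv's output
after_conv2 = conv1d_out_len(after_conv1, kernel_size=5, stride=2)

# nn.Flatten() merges channels and length: 20 output channels * remaining length
flat_features = 20 * after_conv2

print(after_conv1, after_conv2, flat_features)  # 18 7 140
```

The last value is what `nn.Linear` should take as `in_features`. ReLU and BatchNorm1d do not change the shape, so only the two convolutions need to be tracked.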