RuntimeError: The size of tensor a (133) must match the size of tensor b (10) at non-singleton dimension 1

I am training a CNN model. I am facing issue while doing the training iteration for my model. The code is as below:

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()

        #convo layers
        self.conv1 = nn.Conv2d(3,32,3)
        self.conv2 = nn.Conv2d(32,64,3)
        self.conv3 = nn.Conv2d(64,128,3)
        self.conv4 = nn.Conv2d(128,256,3)
        self.conv5 = nn.Conv2d(256,512,3)

        #pooling layer
        self.pool = nn.MaxPool2d(2,2)

        #linear layers
        self.fc1 = nn.Linear(512*5*5,2048)
        self.fc2 = nn.Linear(2048,1024)
        self.fc3 = nn.Linear(1024,133)

        #dropout layer
        self.dropout = nn.Dropout(0.3)
        def forward(self, x):
        #first layer
        x = self.conv1(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)
        #second layer
        x = self.conv2(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)
        #third layer
        x = self.conv3(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)
        #fourth layer
        x = self.conv4(x)
        x = F.relu(x)
        x = self.pool(x)
        #fifth layer
        x = self.conv5(x)
        x = F.relu(x)
        x = self.pool(x)
        #x = self.dropout(x)

        #reshape tensor
        x = x.view(-1,512*5*5)
        #last layer
        x = self.dropout(x)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)

        return x

        #loss func
        criterion = nn.MSELoss()
        optimizer = optim.Adam(net.parameters(), lr = 0.0001)
        #criterion = nn.CrossEntropyLoss()
        #optimizer = optim.SGD(net.parameters(), lr = 0.05)

        def train(n_epochs,model,loader,optimizer,criterion,save_path):    
           for epoch in range(n_epochs):
              train_loss = 0
              valid_loss = 0
              #training 
              net.train()
              for batch, (data,target) in enumerate(loaders['train']):
                   optimizer.zero_grad()
                   outputs = net(data)
                   #print(outputs.shape)
                   loss = criterion(outputs,target)
                   loss.backward()
                   optimizer.step()

When I use the CrossEntropy Loss function and SGD optimizer, I able able to train the model with no error. When I use MSE loss function and Adam optimizer, I am facing the following error:

RuntimeError Traceback (most recent call last) <ipython-input-20-2223dd9058dd> in <module>
      1 #train the model
      2 n_epochs = 2
----> 3 train(n_epochs,net,loaders,optimizer,criterion,'saved_model/dog_model.pt')

<ipython-input-19-a93d145ef9f7> in train(n_epochs, model, loader, optimizer, criterion, save_path)
     22 
     23             #calculate loss
---> 24             loss = criterion(outputs,target)
     25 
     26             #backward prop

RuntimeError: The size of tensor a (133) must match the size of tensor b (10) at non-singleton dimension 1.

Does the selected loss function and optimizer effect the training of the model? Can anyone please help on this?

Solution

The error message clearly suggests that the error occurred at the line

loss = criterion(outputs,target)

where you are trying to compute the mean-squared error between the input and the target. See this line: criterion = nn.MSELoss().

I think you should modify your code where you are estimating loss between (output, target) pair of inputs,i.e., loss = criterion(outputs,target) to something like below:

loss = criterion(outputs,target.view(1, -1))

Here, you are making target shape same as outputs from model on line

outputs = net(data)

One more think to notice here is the output of the net model, i.e., outputs will be of shape batch_size X output_channels, where batch size if the first dimension of input images as during the training you will get batches of images, so your shape in the forward method will get an additional batch dimension at dim0: [batch_size, channels, height, width], and ouput_channels is number of output features/channels from the last linear layer in the net model.

And, the the target labels will be of shape batch_size, which is 10 in your case, check batch_size you passed in torch.utils.data.DataLoader(). Therefore, on reshaping it using view(1, -1), it will be of converted into a shape 1 X batch_size, i.e., 1 X 10.

That's why, you are getting the error:

RuntimeError: input and target shapes do not match: input [10 x 133], target [1 x 10]

So, a way around is to replace loss = criterion(outputs,target.view(1, -1)) with loss = criterion(outputs,target.view(-1, 1)) and change the output_channels of last linear layer to 1 instead of 133. In this way, both of outputs and target shape will be equal and we can compute MSE value then.

Learn more about pytorch MSE loss function from here.