Tags: python, deep-learning, pytorch, conv-neural-network

RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x400 and 600x120)


I have the CNN below and I am getting the following error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x400 and 600x120). I'm using the CIFAR10 dataset, which contains a total of 60,000 32x32 images with 10 labels. If I understand correctly, the input to x = F.relu(self.fc1(x)) should be 600x200, but the input is actually 32x400. Where I am lost is which part I need to change (or calculate).

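import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])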

batch_size = 32

cifar10 = torchvision.datasets.CIFAR10(root='./data', download=True, transform=torchvision.transforms.ToTensor())
pivot = 40000
cifar10 = sorted(cifar10, key=lambda x: x[1])
train_set = torch.utils.data.Subset(cifar10, range(pivot))
val_set = torch.utils.data.Subset(cifar10, range(pivot, len(cifar10)))
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=True)

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(600, 120)
        self.fc2 = nn.Linear(120, 2)
        self.fc3 = nn.Linear(2, 10)
        self.flatten = nn.Flatten(1)
    
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.flatten(x)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = Network()
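
A quick way to see what fc1 actually receives is to push a dummy batch through the convolutional part and inspect the shapes. The following is a minimal sketch, not part of the original code: the layers are re-created with the same arguments as in Network, and the names after_block1, after_block2 and flat_shape are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer arguments as in Network above (no padding on the convolutions).
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

# Dummy batch shaped like one CIFAR10 batch: 32 images, 3 channels, 32x32 pixels.
x = torch.zeros(32, 3, 32, 32)

with torch.no_grad():
    x = pool(F.relu(conv1(x)))
    after_block1 = tuple(x.shape)    # (32, 6, 14, 14)
    x = pool(F.relu(conv2(x)))
    after_block2 = tuple(x.shape)    # (32, 16, 5, 5)
    flat_shape = tuple(torch.flatten(x, 1).shape)  # (32, 400)

print(after_block1, after_block2, flat_shape)
```

The first number of the flattened shape is the batch size and the second is the number of features per image, which is the value that in_features of fc1 has to match.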

I tried solutions from other posts with similar errors but wasn't able to fix my code. I also tried adding torch.nn.AdaptiveMaxPool2d, but I don't think I used it correctly, and I'm not sure whether I actually need it.


Solution

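  • fc1 expects 600 input features, but the flattened output of the second pooling layer has 16 * 5 * 5 = 400 features per image (the spatial size goes 32 → 28 → 14 → 10 → 5 with unpadded 5x5 convolutions and 2x2 pooling). That is the 32x400 in the error, where 32 is the batch size. One fix is to set fc1 to nn.Linear(16*5*5, 120). It is cleaner, though, to use "same"-padding 2D convolutions when downsampling is done by pooling, so that only the pooling layers change the spatial size (32 → 16 → 8) and fc1 takes 8*8*16 = 1024 inputs: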

    class Network(nn.Module):
        def __init__(self):
            super(Network, self).__init__()
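            self.conv1 = nn.Conv2d(3, 6, kernel_size=5, padding=2)   # "same" padding: 32x32 stays 32x32
            self.pool = nn.MaxPool2d(2, 2) # downsample / 2
            self.conv2 = nn.Conv2d(6, 16, kernel_size=5, padding=2)  # "same" padding: 16x16 stays 16x16
            self.fc1 = nn.Linear(8*8*16, 120)  # 16 channels of 8x8 after two poolings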
            self.fc2 = nn.Linear(120, 2)
            self.fc3 = nn.Linear(2, 10)
            self.flatten = nn.Flatten(1)
        
        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = self.flatten(x)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
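
The in_features value for fc1 can be worked out by hand from the usual output-size formula for convolution and pooling layers, out = floor((in + 2*padding - kernel) / stride) + 1. The sketch below is plain Python and only illustrates that arithmetic. The helper names conv_out and flattened_features are made up for this example, and it assumes dilation 1 and the two conv + pool stages used above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(image_size, conv_padding, channels_out=16):
    """Features per image after two (5x5 conv -> 2x2 max pool) stages."""
    s = image_size
    for _ in range(2):
        s = conv_out(s, kernel=5, padding=conv_padding)  # 5x5 convolution, stride 1
        s = conv_out(s, kernel=2, stride=2)              # 2x2 max pooling, stride 2
    return channels_out * s * s


no_padding = flattened_features(32, conv_padding=0)    # 16 * 5 * 5
same_padding = flattened_features(32, conv_padding=2)  # 16 * 8 * 8
print(no_padding, same_padding)  # 400 1024
```

So the original network needs nn.Linear(400, 120) for fc1, while the padded version above needs nn.Linear(1024, 120), which is what 8*8*16 evaluates to.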