Tags: machine-learning, deep-learning, pytorch, neural-network, linear-regression

PyTorch matrix multiplication shape error: "RuntimeError: mat1 and mat2 shapes cannot be multiplied"


I'm new to PyTorch and creating a multi-output linear regression model to color words based on their letters. (This will help people with grapheme-color synesthesia have an easier time reading.) It takes in words and outputs RGB values. Each word is represented as a vector of 45 floats in [0, 1], where values in (0, 1] encode letters and 0 means that no letter exists at that position. The output for each sample should be a vector [r-value, g-value, b-value].

I'm getting

RuntimeError: mat1 and mat2 shapes cannot be multiplied (90x1 and 45x3)

when I try to run my model in the training loop.

Looking at existing Stack Overflow posts, I think this means that I need to reshape my data, but I don't know how or where to do so in a way that would solve this problem, especially since I don't know where that 90x1 matrix came from.
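The 90x1 most likely comes from batching. Assuming each encoded word is stored as a tensor of shape (45, 1) (the later edit to the question suggests this, but it is an assumption), a DataLoader with batch_size=2 stacks two of them into (2, 45, 1). torch.nn.Linear multiplies against the last dimension only, so it effectively sees 2 * 45 = 90 rows of 1 feature each. A minimal sketch that reproduces this, with random data standing in for the real words:

```python
import torch

# Assumption: each encoded word is a column vector of shape (45, 1).
word_a = torch.rand(45, 1)
word_b = torch.rand(45, 1)

# DataLoader's default collation stacks samples along a new first dimension.
batch = torch.stack([word_a, word_b])  # shape (2, 45, 1)

linear = torch.nn.Linear(45, 3)

# Linear treats the LAST dimension as the feature dimension, so it sees
# 90 rows with 1 feature each instead of 2 rows with 45 features.
flattened_shape = tuple(batch.reshape(-1, batch.shape[-1]).shape)

try:
    linear(batch)
    error_message = None
except RuntimeError as err:
    error_message = str(err)

# Dropping the trailing dimension gives the (batch, features) layout Linear expects.
fixed_output_shape = tuple(linear(batch.squeeze(-1)).shape)
```

Here `flattened_shape` is the 2-D shape that ends up in the error message, and `fixed_output_shape` is what the layer returns once the trailing dimension of size 1 is removed.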

My Model

I started simple; multiple layers can come after I can get a single layer to function.

import torch

# `device` is assumed to be defined earlier in the script
class ColorPredictor(torch.nn.Module):
    # Constructor
    def __init__(self):
        super().__init__()
        # 45 = length of an encoded word vector, 3 = size of the (r, g, b) output
        self.linear = torch.nn.Linear(45, 3, device=device)

    # Prediction
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y_pred = self.linear(x)
        return y_pred
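While debugging, it can help to have the model itself report a wrongly shaped input before the matrix multiplication fails. The sketch below is one way to do that; `CheckedColorPredictor` and its constructor parameters are hypothetical names introduced for illustration, not part of the original model, and it runs on the CPU to stay self-contained.

```python
import torch


class CheckedColorPredictor(torch.nn.Module):
    """Hypothetical variant of ColorPredictor that validates its input shape."""

    def __init__(self, in_features: int = 45, out_features: int = 3):
        super().__init__()
        self.linear = torch.nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Linear multiplies against the last dimension, so that is the one to check.
        if x.shape[-1] != self.linear.in_features:
            raise ValueError(
                f"expected last dimension {self.linear.in_features}, "
                f"got input of shape {tuple(x.shape)}"
            )
        return self.linear(x)


model = CheckedColorPredictor()

# A (batch, features) input passes through and yields one RGB triple per word.
good_shape = tuple(model(torch.rand(2, 45)).shape)

# A batch of column vectors is rejected with a message naming the real shape.
try:
    model(torch.rand(2, 45, 1))
    bad_message = None
except ValueError as err:
    bad_message = str(err)
```

The message then names the full input shape, (2, 45, 1) in this case, rather than the flattened 90x1 that the RuntimeError shows.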

How I'm loading my data

from torch.utils.data import Dataset, DataLoader

# Dataset Class
class Data(Dataset):
    # Constructor
    def __init__(self, inputs, outputs):
        self.x = inputs  # a list of encoded word vectors
        self.y = outputs  # a Pandas dataframe of r,g,b values converted to a torch tensor
        self.len = len(inputs)

    # Getter
    def __getitem__(self, index):
        return self.x[index], self.y[index]

    # Get number of samples
    def __len__(self):
        return self.len

# create train/test split
train_size = int(0.8 * len(data))
train_data = Data(inputs[:train_size], outputs[:train_size])
test_data = Data(inputs[train_size:], outputs[train_size:])

# create DataLoaders for training and testing sets
train_loader = DataLoader(dataset=train_data, batch_size=2)
test_loader = DataLoader(dataset=test_data, batch_size=2)
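One place to fix the shape is inside the dataset, so that every batch already arrives as (batch_size, 45). The following is a sketch under assumptions: `FlatData` is a hypothetical variant of the `Data` class above, the encoded words are assumed to be (45, 1) tensors, and random values stand in for the real inputs and RGB targets.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class FlatData(Dataset):
    """Hypothetical variant of Data that flattens each word to 45 features."""

    def __init__(self, inputs, outputs):
        # Flatten every encoded word to 1-D and stack them into one (N, 45) tensor.
        self.x = torch.stack(
            [torch.as_tensor(v, dtype=torch.float32).reshape(-1) for v in inputs]
        )
        self.y = torch.as_tensor(outputs, dtype=torch.float32)

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.x)


# Stand-in data: six words stored as (45, 1) column vectors, six RGB targets.
inputs = [torch.rand(45, 1) for _ in range(6)]
outputs = torch.rand(6, 3)

loader = DataLoader(FlatData(inputs, outputs), batch_size=2)
x_batch, y_batch = next(iter(loader))
batch_shapes = (tuple(x_batch.shape), tuple(y_batch.shape))
```

With this layout each batch of inputs has shape (2, 45) and each batch of targets has shape (2, 3), which matches what a Linear(45, 3) layer and the loss expect.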

The training loop, where the error occurs

for epoch in range(epochs):
    # Train
    model.train() #training mode
    for x,y in train_loader:
        y_pred = model(x) #ERROR HERE
        loss = criterion(y_pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
      

Error Traceback (two screenshots in the original post, not reproduced here)

New Attempt:

I changed the 45x1 input tensor to a 2x45 input tensor, with the second column being all zeros. This works for the first pass through the train_loader loop, but on the second pass I get another matrix multiplication error, this time for matrices of sizes 90x2 and 45x3.
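Both error messages fit the same rule: the layer collapses every dimension except the last into rows and multiplies the result by a 45x3 matrix. The helper below (`shape_seen_by_linear` is a hypothetical name, and it uses no PyTorch at all) sketches that bookkeeping. It assumes the padded sample ended up with 45 rows and 2 columns, since that is the layout consistent with the reported 90x2; the post itself does not confirm this.

```python
import math


def shape_seen_by_linear(batch_shape):
    """Return the 2-D (rows, features) shape a Linear layer effectively multiplies."""
    *leading, features = batch_shape
    # All leading dimensions are collapsed into rows; the last one is the features.
    return (math.prod(leading), features)


# Two (45, 1) samples per batch: the original error.
column_case = shape_seen_by_linear((2, 45, 1))

# Two (45, 2) samples per batch (assumed layout of the padded attempt).
padded_case = shape_seen_by_linear((2, 45, 2))

# Two flat 45-element samples per batch: the layout Linear(45, 3) can multiply.
flat_case = shape_seen_by_linear((2, 45))
```

Only the last case has 45 in the feature position, so only that one can be multiplied by the 45x3 matrix.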


Solution

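  • I reshaped each encoded word vector from (45, 1) to (1,45), so that the 45 features sit in the last dimension, which is the one torch.nn.Linear multiplies against.

    If the input size is (1,45) and batch_size = 2:

    size of weight matrix = output_features x input_features = 3x45 (so its transpose is 45x3)
    
    bias vector size = output_features = 3 (repeated for each of the 2 rows in the batch)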
    
    
             input x       weight transposed            bias
    y = [ [1,2,3,...,45],  * [ [1, 2, 3],     +    [ [b1, b2, b3],
          [3,2,1,...,45]]      [2, 2, 3],            [b1, b2, b3] ]
                               [3, 2, 3],
                               [.  .  .],
                               [45,45,45] ]
    
             2x45       *        45x3
                      
                       2x3                   +           2x3
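The shape arithmetic in the diagram can be checked directly. This is a small sketch with random values in place of real word encodings; it compares a manual `x @ W.T + b` against the output of a torch.nn.Linear(45, 3) layer for a 2x45 batch.

```python
import torch

torch.manual_seed(0)

linear = torch.nn.Linear(45, 3)
x = torch.rand(2, 45)  # batch of 2 words, 45 features each

# Weight is stored as (out_features, in_features) = (3, 45); bias has 3 entries.
weight_shape = tuple(linear.weight.shape)
bias_shape = tuple(linear.bias.shape)

# (2, 45) @ (45, 3) gives (2, 3); the bias is broadcast across both rows.
manual = x @ linear.weight.T + linear.bias
builtin = linear(x)

output_shape = tuple(builtin.shape)
matches = torch.allclose(manual, builtin, atol=1e-5)
```

The manual product and the layer's output agree, and both have one row per word and one column per color channel.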