Tags: python, numpy, pytorch, tensor, matmul

Is there any difference between `matmul()` and usual multiplication of tensors?


I am confused about the difference between multiplying two tensors with * and with matmul().

Below is my code:

import torch
torch.manual_seed(7)
features = torch.randn((2, 5))
weights = torch.randn_like(features)

Here, I want to multiply weights and features. One way to do it is as follows:

print(torch.sum(features * weights))

Output:

tensor(-2.6123)

Another way to do it is using matmul():

print(torch.matmul(features,weights.view((5,2))))

But here the output is:

tensor([[ 2.8089,  4.6439],
        [-2.3988, -1.9238]])

What I don't understand is why matmul() and the usual multiplication give different outputs when I expected both to be the same. Am I doing anything wrong here?

Edit: When I use features of shape (1,5), the * and matmul() outputs are the same, but they are not the same when the shape is (2,5).


Solution

  • When you use *, the multiplication is elementwise; when you use torch.mm, it is matrix multiplication (for 2-D tensors, torch.matmul behaves the same way as torch.mm).

    Example:

    a = torch.rand(2,5)
    b = torch.rand(2,5)
    result = a*b 
    

    result will have the same shape as a and b, i.e. (2,5), whereas the operation

    result = torch.mm(a,b)
    

    gives a size mismatch error, because this is proper matrix multiplication (as studied in linear algebra) and a.shape[1] != b.shape[0]. The view call in your matmul() code reshapes weights to (5,2) so that the dimensions match; note that view reinterprets the same values in order rather than transposing them.

    In the special case where both tensors have a single row, i.e. shape (1,5), the matrix product reduces to a dot product, and hence sum(a*b) gives the same value as mm(a, b.view(5,1)).
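The points above can be checked with a short script. This is only a minimal sketch: the seed and the variable names (elementwise, product, row_a, and so on) are arbitrary choices for illustration and do not come from the question, so the values will not match the outputs shown there.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, used only for reproducibility
a = torch.rand(2, 5)
b = torch.rand(2, 5)

# Element-wise product: the result has the same shape as the inputs.
elementwise = a * b
print(elementwise.shape)

# Matrix product needs a.shape[1] == b.shape[0], so (2, 5) x (2, 5) fails.
try:
    torch.mm(a, b)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("torch.mm failed:", err)

# Transposing b gives a valid (2, 5) x (5, 2) product with a (2, 2) result.
product = torch.mm(a, b.t())
print(product.shape)

# view(5, 2) keeps the 10 values in their original order; it is not a transpose.
same_as_transpose = torch.equal(b.view(5, 2), b.t())
print(same_as_transpose)

# Special case: with a single row, the matrix product is a dot product.
row_a = a[:1]  # shape (1, 5)
row_b = b[:1]  # shape (1, 5)
dot_via_sum = torch.sum(row_a * row_b)          # 0-dim tensor
dot_via_mm = torch.mm(row_a, row_b.view(5, 1))  # shape (1, 1)
print(torch.allclose(dot_via_sum, dot_via_mm.squeeze()))
```

For the (1,5) case the two results hold the same value but differ in shape: the sum is a 0-dimensional tensor, while mm returns a (1,1) matrix, which is why the comparison squeezes the latter first.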