Search code examples
pythonarraysnumpydot-product

Is there a reduced form of the dot product in numpy?


I am trying to find a numpy operation that gives me the scalar products between all vectors of a 2d array at index i of a 3d array and the vector at index i of a 2d array. Let me give you an example to explain what I am thinking of:

x = np.array([[[1,2,3],
              [2,3,4]],
             
             [[11,12,13],
              [12,13,14]]])

y = np.array([[1,1,1],
              [2,2,2]])



np.?operation?(x,y.T)

output:
[[[1 *1 + 1 *2 + 1 *3], 
  [1 *2 + 1 *3 + 1 *4]],

 [[2 *11 + 2 *12 + 2 *13], 
  [2 *12 + 2* 13 + 2 *14]]]

= [[[6], 
    [9]],

   [[72], 
    [78]]]

As you can see, I am basically looking for a reduced dot product operation. The dot product of x and y would yield the following:

np.dot(x, y.T)

output:
[[[ 6 12]
  [ 9 18]]

 [[36 72]
  [39 78]]]

Or is there a way to extract the results I need from the dot product result?

I have also tried np.tensordot(x,y,axis) but I were not able to figure which tuples I should put for -axis-. I have also come across the np.einsum() operation but couldn't work my head around how this could help me with my problem.


Solution

  • It should be doable with np.einsum or np.matmul/@ which has a "batch" operation on the leading dimension. But sorting out your dimensions, and getting the (2,2,1) shape is a bit tricky.

    Your np.dot(x, y.T) gives the numbers you want, but you have to extract a kind of diagonal on 2, while retaining a dimension.

    Here's one way of doing this - it isn't the fastest or succinct, but should help me wrap my mind around the dimensions.

    In [432]: y[:,None,:]
    Out[432]: 
    array([[[1, 1, 1]],
    
           [[2, 2, 2]]])
    In [433]: y[:,None,:].repeat(2,1)
    Out[433]: 
    array([[[1, 1, 1],
            [1, 1, 1]],
    
           [[2, 2, 2],
            [2, 2, 2]]])
    In [435]: x*y[:,None,:].repeat(2,1)
    Out[435]: 
    array([[[ 1,  2,  3],
            [ 2,  3,  4]],
    
           [[22, 24, 26],
            [24, 26, 28]]])
    In [436]: (x*y[:,None,:].repeat(2,1)).sum(axis=-1, keepdims=True)
    Out[436]: 
    array([[[ 6],
            [ 9]],
    
           [[72],
            [78]]])
    

    We don't need the repeat, broadcasting will take its place:

    (x*y[:,None,:]).sum(axis=-1, keepdims=True)
    

    This einsum does the same as the dot/@:

    In [441]: np.einsum('ijk,lk->ijl',x,y)
    Out[441]: 
    array([[[ 6, 12],
            [ 9, 18]],
    
           [[36, 72],
            [39, 78]]])
    

    Change the indices a bit to get the "diagonal" (i in all terms)

    In [442]: np.einsum('ijk,ik->ij',x,y)
    Out[442]: 
    array([[ 6,  9],
           [72, 78]])
    

    and add a trailing dimension:

    In [443]: np.einsum('ijk,ik->ij',x,y)[:,:,None]
    Out[443]: 
    array([[[ 6],
            [ 9]],
    
           [[72],
            [78]]])
    

    Now that I have the einsum I can visualize the matmul/@ dimensions. I need to the treat the first dimension of both as the 'batch', and add a new trailing dimension to y, making it (2,3,1). (2,2,3) with (2,3,1) => (2,2,1) with sum-of-products on the 3.

    In [445]: x@y[:,:,None]
    Out[445]: 
    array([[[ 6],
            [ 9]],
    
           [[72],
            [78]]])
    

    If x and y were (4,2,3) and (4,3) shaped, this dimension matching would have been more obvious.

    In [446]: X=x.repeat(2,0)
    In [447]: Y=y.repeat(2,0)
    In [448]: X.shape
    Out[448]: (4, 2, 3)
    In [449]: Y.shape
    Out[449]: (4, 3)
    In [450]: X@Y[:,:,None]    # (4,2,1)
    Out[450]: 
    array([[[ 6],
            [ 9]],
    
           [[ 6],
            [ 9]],
    
           [[72],
            [78]],
    
           [[72],
            [78]]])
    

    With these shapes it's more obvious that 4 is the batch, and 3 is the sum-of-products.