Tags: numpy, scipy, lapack, intel-mkl, numpy-einsum

In multi-dimensional array products, how do I align axes with and without summation?


What is the best way to do array operations when some repeated indices are summed over AND others are not? It seems like I may have to use einsum for these operations, but it would be better if there were a tensordot-like alternative with a flag for dimensions that are aligned but not summed.

Does anybody know of a fast numerical routine (in LAPACK, perhaps?) that behaves like tensordot, except that some axes can be aligned without being summed?

==

Here is some example code to show the type of array operation that is needed. The operation I need is done by method_sum, method_einsum, and method_matmul. A similar operation, which also sums over the matching j-axis, is done by method2_einsum and method2_tensordot.

Comparing the times, it seems that tensordot should be able to beat einsum on the first problem. However, it does not have the functionality to align axes without summing over them.

import scipy as sp  # older scipy re-exports the numpy functions used below (ones, einsum, matmul, tensordot)

# Shapes of arrays
I = 200
J = 50
K = 200
L = 100

a = sp.ones((I, J, L))
b = sp.ones((J, K, L))


# The desired product has a sum over the l-axis

## Use broadcasting to multiply and sum over the last dimension
def method_sum(a, b):
    "Multiply arrays and sum over last dimension."   
    c = (a[:, :, None, :] * b[None, :, :, :]).sum(-1)
    return c

## Use einsum to multiply arrays and sum over the l-axis
def method_einsum(a, b):
    "Multiply arrays and sum over last dimension."
    c = sp.einsum('ijl,jkl->ijk', a, b)
    return c

## Use matmul to multiply arrays and sum over one of the axes
def method_matmul(a, b):
    "Multiply arrays using the new matmul operation."
    c = sp.matmul(a[:, :, None, None, :], 
                  b[None, :, :, :, None])[:, :, :, 0, 0]
    return c


# Compare einsum vs tensordot on summation over j and l

## Einsum takes about the same amount of time whether or not j is summed over
def method2_einsum(a, b):
    "Multiply arrays and sum over last dimension."
    c = sp.einsum('ijl,jkl->ik', a, b)
    return c

## tensordot can do this faster, but it always sums over the aligned axes
def method2_tensordot(a, b):
    "Multiply and sum over all overlapping dimensions."
    c = sp.tensordot(a, b, axes=[(1, 2,), (0, 2,)])
    return c

Here are some times for the various routines on my computer. tensordot can beat einsum for method2 because it uses multiple cores (the whole contraction becomes a single BLAS matrix product). I would like to achieve tensordot-like performance for the calculation where both the J- and L-axes are aligned but only the L-axis is summed.

Time for method_sum:
1 loops, best of 3: 744 ms per loop

Time for method_einsum:
10 loops, best of 3: 95.1 ms per loop

Time for method_matmul:
10 loops, best of 3: 93.8 ms per loop

Time for method2_einsum:
10 loops, best of 3: 90.4 ms per loop

Time for method2_tensordot:
100 loops, best of 3: 10.9 ms per loop

Solution

    In [85]: I,J,K,L=2,3,4,5
    In [86]: a=np.ones((I,J,L))
    In [87]: b=np.ones((J,K,L))
    
    In [88]: np.einsum('ijl,jkl->ijk',a,b).shape
    Out[88]: (2, 3, 4)
    

    Playing around with the new @ operator, I find I can produce:

    In [91]: (a[:,:,None,None,:]@b[None,:,:,:,None]).shape
    Out[91]: (2, 3, 4, 1, 1)
    
    In [93]: (a[:,:,None,None,:]@b[None,:,:,:,None])[...,0,0].shape
    Out[93]: (2, 3, 4)
    

    The shapes are right, but I haven't checked the values. Some of the None insertions line up the i, j, and k axes, and the other two produce the regular dot behavior (last axis of the first operand against the second-to-last axis of the second).
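
    A quick value check (a small added sketch, using random data rather than ones so that agreement is meaningful) could look like this:

    import numpy as np

    I, J, K, L = 2, 3, 4, 5
    x = np.random.rand(I, J, L)     # stand-ins for a and b with the small test shapes
    y = np.random.rand(J, K, L)

    ref = np.einsum('ijl,jkl->ijk', x, y)
    via_matmul = (x[:, :, None, None, :] @ y[None, :, :, :, None])[..., 0, 0]
    print(np.allclose(ref, via_matmul))   # True: both compute sum over l of x[i,j,l]*y[j,k,l]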

    With your sizes the times are about the same:

    In [97]: np.einsum('ijl,jkl->ijk',a,b).shape
    Out[97]: (200, 50, 200)
    In [98]: (a[:,:,None,None,:]@b[None,:,:,:,None])[...,0,0].shape
    Out[98]: (200, 50, 200)
    In [99]: timeit np.einsum('ijl,jkl->ijk',a,b).shape
    1 loop, best of 3: 2 s per loop
    In [100]: timeit (a[:,:,None,None,:]@b[None,:,:,:,None])[...,0,0].shape
    1 loop, best of 3: 2.06 s per loop
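
    Since tensordot's speed in the question comes from handing one big matrix product to a multithreaded BLAS, another direction that might be worth timing (just a sketch under that assumption, not benchmarked here) is to loop over the aligned j-axis and let a BLAS-backed matmul handle each (I, L) @ (L, K) slice:

    import numpy as np

    def method_loop_dot(a, b):
        "Keep the aligned j-axis, sum over l, with one BLAS matmul per j slice (hypothetical helper)."
        I, J, L = a.shape
        _, K, _ = b.shape
        c = np.empty((I, J, K), dtype=np.result_type(a, b))
        for j in range(J):
            # a[:, j, :] is (I, L) and b[j].T is (L, K); the product sums the shared l-axis
            c[:, j, :] = a[:, j, :] @ b[j].T
        return c

    With J much smaller than I and K (J=50 versus I=K=200 in the question), the Python-level loop overhead should be small compared to the per-slice BLAS calls, but whether this actually beats einsum would need to be measured.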