Search code examples
pythonarraysnumpynumpy-einsumtensordot

One line einsum functions with "interleaved" output indexing impossible to recreate using tensordot?


The similarities and differences between NumPy's tensordot and einsum functions are well documented and have been extensively discussed in this forum (e.g. [1], [2], [3], [4], [5]). However, I've run into an instance of matrix multiplication using einsum that I'm finding very difficult, if not impossible, to replicate using tensordot: If our two arrays are,

>>> A = np.array([[0, 1], [1, 0]])
>>> B = np.arange(2 ** 4).reshape((2, 2, 2, 2))

does there exist a one line tensordot equivalent to the following?

>>> np.einsum("ab,ibjk->iajk", A, B)
array([[[[ 4,  5],
         [ 6,  7]],

        [[ 0,  1],
         [ 2,  3]]],


       [[[12, 13],
         [14, 15]],

        [[ 8,  9],
         [10, 11]]]]) 

From what I've found, the answer seems to be "no". The trouble arises in the indexing of the output dimension, iajk. Here, dimension a of array A appears in-between dimensions i and j of array B. Had the indexing of the output dimension instead been aijk, np.tensordot(A, B, (1, 1)) would have worked fine. I ran a test using all possible axes to be sure,

>>> output_einsum = np.einsum("ab,ibjk->iajk", A, B)
>>> axes_A = [-2, -1, 0, 1]
>>> axes_B = [-4, -3, -2, -1, 0, 1, 2, 3]
>>> for i in axes_A:
...     for j in axes_B:
...         output_tensordot = np.tensordot(A, B, axes=(i, j))
...         if np.allclose(ouput_einsum, output_tensordot):
...             print(i,j)
...

and found that no combination of the allowed axes produced the desired result. Note that the dimension of B limits each element of the axes parameter to length one. Is it correct that einsum functions with interleaving output dimensions cannot be reproduced in one line using tensordot? And if is so, does there exist a multi-line work-around?


Solution

  • As I stress in my earlier answers, tensordot is an extension of np.dot, allowing us to specify which dimensions are used in the sum-of-products. The dot default is last of A, 2nd to the last of B.

    This illustrates how dot handles dimensions greater than 2:

    In [158]: np.dot(np.ones((2,3,4)),np.ones((5,4,7))).shape
    Out[158]: (2, 3, 5, 7)
    

    The way tensordot puts it, the noncontracted dimensions of B follow those of A. So taking the same arrays, but shifting the axes, produces the same thing.

    In [162]: np.tensordot(np.ones((2,4,3)),np.ones((5,7,4)),(1,2)).shape 
    Out[162]: (2, 3, 5, 7)
    

    In these examples I chose distinct dimensions so the order is more obvious.

    tensordot does not provide a way of reordering the noncontracted dimensions. But you can easily do that yourself after.

    Your example has size 2 dimensions all around. This allows you to specify any combination of axes, but requires the use of allclose to test results.

    In [146]: >>> A = np.array([[0, 1], [1, 0]])
         ...: >>> B = np.arange(2 ** 4).reshape((2, 2, 2, 2))
    

    Performing the sum-of-products on the 2nd axis of both arrays:

    In [147]: C=np.tensordot(A,B,(1,1))
    In [148]: C.shape
    Out[148]: (2, 2, 2, 2)
    In [149]: C
    Out[149]: 
    array([[[[ 4,  5],
             [ 6,  7]],
    
            [[12, 13],
             [14, 15]]],
    
    
           [[[ 0,  1],
             [ 2,  3]],
    
            [[ 8,  9],
             [10, 11]]]])
    

    And the einsum with its default result ordering ('aijk')

    In [150]: D= np.einsum('ab,ibjk',A,B)
    In [151]: np.allclose(C,D)
    Out[151]: True
    

    The tensordot is the equivalent of this dot:

    In [152]: E = np.dot(A,B.reshape(2,2,4))
    In [153]: E.shape
    Out[153]: (2, 2, 4)
    In [154]: np.allclose(C,E.reshape(2,2,2,2))
    Out[154]: True
    In [155]: np.allclose(E,np.einsum('ab,ibk',A,B.reshape(2,2,4)))
    Out[155]: True