I'm trying to use numpy.einsum
to simplify a loop I have in my code.
Currently, my code looks something like this:
k = 100
m = 50
n = 10
A = np.arange(k*m*n).reshape(k, m, n)
B = np.arange(m*m).reshape(m, m)
T = np.zeros((n, n))
for ind in xrange(k):
T += np.dot(A[ind,:,:].T, np.dot(B, A[ind,:,:]))
I'm trying to use numpy.einsum
as an alternative to this loop:
Tp = np.einsum('nij,njk->ik', np.einsum('nij,jk->nik', A.transpose(0,2,1), B), A)
print np.allclose(T, Tp)
Is it possible to use a single numpy.einsum
instead of two?
Really smart approaches you got in the other answer. I would like to add numpy.dot
based approach into the mix that also uses some reshaping
. Here's one way to do so -
k,m,n = A.shape
((A.transpose(2,0,1).reshape(-1,m).dot(B.T)).reshape(n,-1)).dot(A.reshape(-1,n)).T
Runtime tests -
This section compares all the approaches listed in the other answer
and numpy.dot
based one listed earlier in this post.
In [130]: k = 100
...: m = 50
...: n = 10
...: A = np.arange(k*m*n).reshape(k, m, n)
...: B = np.arange(m*m).reshape(m, m)
...:
In [131]: %timeit np.einsum('nij,njk->ik', np.einsum('nij,jk->nik', A.transpose(0,2,1), B), A)
100 loops, best of 3: 10.7 ms per loop
In [132]: %timeit np.einsum('nij, il, kln ->jk', A, B, A.T)
10 loops, best of 3: 105 ms per loop
In [133]: %timeit np.tensordot(A, np.tensordot(A, B, axes=(1, 1)), axes=((0, 1), (0, 2)))
100 loops, best of 3: 6.22 ms per loop
In [134]: %timeit ((A.transpose(2,0,1).reshape(-1,m).dot(B.T)).reshape(n,-1)).dot(A.reshape(-1,n)).T
100 loops, best of 3: 5.3 ms per loop