Search code examples
c++numpyconvolutiontensordot

How would I write numpy.tensordot in c++?


I'm trying to replicate numpy.tensordot in c++. The example in the numpy documentation shows a nested loop that I can get to work, but what if instead of

c = np.tensordot(a,b, axes=([1,0],[0,1]))

I want to do:

c = np.tensordot(a,b, axes=([1,2],[0,1]))

What would that new nested loop look like in python? And is there an easier/faster way of doing this operation in c++? Right now I'm using the same nested "for" loops with std::vector's in c++. I've seen a few libraries that might help, but I'm trying to use just the c++ standard library.

Here is that numpy example, and the link to the documentation: https://numpy.org/doc/stable/reference/generated/numpy.tensordot.html

Examples

A “traditional” example:

>>>
a = np.arange(60.).reshape(3,4,5)
b = np.arange(24.).reshape(4,3,2)
c = np.tensordot(a,b, axes=([1,0],[0,1]))
c.shape
(5, 2)
c
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
# A slower but equivalent way of computing the same...
d = np.zeros((5,2))
for i in range(5):
  for j in range(2):
    for k in range(3):
      for n in range(4):
        d[i,j] += a[k,n,i] * b[n,k,j]
c == d
array([[ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True]])

Thank you


Solution

  • I find rewriting to np.einsum first is helpful as the resulting for loop code looks quite similar conceptually:

    a = np.random.rand(16, 8, 2)
    b = np.random.rand(8, 2, 1)
    
    
    c =  np.tensordot(a, b, axes=([1,2],[0,1]))
    
    # same thing written with einsum
    c_ein = np.einsum("ijk,jko->io", a, b)
    
    # same thing done with for loops, 
    # notice how we can use the same letters and indexing as einsum
    c_manual = np.zeros((16, 1))
    for i in range(16):
        for o in range(1):
            # j and k are summed since they don't appear in output
            total = 0
            for j in range(8):
                for k in range(2):
                    total += a[i, j, k] * b[j, k, o]
            c_manual[i, o] = total
    
    assert np.allclose(c, c_ein, c_manual)