python · numpy · vectorization · matrix-multiplication

Multiply array of n 3x3 rotation matrices with 3d array of 3-vectors


I have an array p of position vectors on a 3d grid [np.shape(p) yields (Nx, Ny, Nz, 3)] and an array Rn of n rotation matrices [np.shape(Rn) yields (n, 3, 3)].

I am trying to get an array PR of shape (n, Nx, Ny, Nz, 3), where the i-th (0 <= i < n) entry along dimension 0 is the 3d array of position vectors p rotated by the 3x3 rotation matrix at index i of Rn.
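In index form, the goal is:

PR[i, x, y, z] = Rn[i] @ p[x, y, z]  # for 0 <= i < n and each grid index (x, y, z)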

theta = np.arange(0, 2*np.pi, np.pi/50)
phi = np.arange(0, np.pi, np.pi/100)

a = np.arange(100)
b = np.arange(50)
p = np.array(np.meshgrid(a, b, a, indexing="xy"))
p = np.moveaxis(p, 1, 2)
p = np.moveaxis(p, 0, 3)
# np.shape(p) => (100,50,100,3) 
Rn = np.array([np.array([np.cos(theta)*np.cos(phi), np.cos(theta)*np.sin(phi), -np.sin(theta)]),
               np.array([-np.sin(phi),              np.cos(phi),               np.zeros(np.shape(phi))]),
               np.array([np.cos(phi)*np.sin(theta), np.sin(theta)*np.sin(phi), np.cos(theta)])])
Rn = np.moveaxis(Rn, 1, 2)
Rn = np.moveaxis(Rn, 0, 1)
# np.shape(Rn) => (100, 3, 3) 

So far I have attempted the following, unsuccessfully.

PR = np.matmul(Rn, p)
# Fails: matmul treats the trailing axes of p as (Nz, 3) matrices,
# which do not align with the (3, 3) matrices in Rn.

What is the most efficient way to perform this operation? I know how to do it with for loops (one possible form is sketched below), but in the interest of efficiency I have been trying to keep everything vectorized within numpy.
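For reference, a straightforward loop version (one possible form of it) looks like this:

PR_loop = np.empty((Rn.shape[0], *p.shape))
for i in range(Rn.shape[0]):
    PR_loop[i] = p @ Rn[i].T  # for row vectors, p @ R.T applies R to each 3-vector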


Solution

  • Two possible solutions are:

    es = np.einsum("ijkl,nal->nijka", p, Rn, optimize=True)  # contract p's vector axis l with Rn's last axis
    td = np.moveaxis(np.tensordot(p, Rn, axes=(-1, -1)), 3, 0)  # tensordot appends Rn's remaining axes; move n to the front
    

    I will also compare these solutions with other answers in this thread.

    p = np.random.rand(10, 20, 30, 3)  # smaller stand-ins for the arrays in the question
    Rn = np.random.rand(100, 3, 3)
    
    es = np.einsum("ijkl,nal->nijka", p, Rn, optimize=True)
    td = np.moveaxis(np.tensordot(p, Rn, axes=(-1, -1)), 3, 0)
    d = np.squeeze(np.moveaxis(np.dot(Rn, p[..., None]), 1, -2), -1)
    out = ((Rn @ p.reshape(-1, 3).T)       # (n, 3, Nx*Ny*Nz)
              .reshape(Rn.shape[0], 3, -1)
              .swapaxes(1, 2)              # (n, Nx*Ny*Nz, 3)
              .reshape(-1, *p.shape)       # (n, Nx, Ny, Nz, 3)
          )
    
    print(np.allclose(es, out))
    print(np.allclose(td, out))
    print(np.allclose(d, out))
    

    All give True.

    If you benchmark their performance,

    %timeit np.einsum("ijkl,nal->nijka", p, Rn, optimize=True)
    %timeit np.moveaxis(np.tensordot(p, Rn, axes=(-1, -1)), 3, 0)
    %timeit ((Rn @ p.reshape(-1, 3).T).reshape(Rn.shape[0], 3, -1).swapaxes(1, 2).reshape(-1, *p.shape))
    %timeit np.moveaxis(np.squeeze(np.dot(Rn, p[..., None]), -1), 1, -1)
    

    you get, for arrays of these sizes on my system:

    3.91 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    4.15 ms ± 168 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    2.45 ms ± 29.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    29.1 ms ± 98.9 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    


    einsum and tensordot seem to have comparable performance, while the @ solution is the fastest. The dot solution seems unreasonably slow, though. I would have imagined it uses @ under the hood, but the two have different semantics for stacked arrays: np.dot sum-products over the last axis of its first argument and the second-to-last axis of its second, producing the (n, 3, Nx, Ny, Nz, 1) intermediate above, and it apparently is not dispatched to a single large matrix multiply the way the reshaped @ version is.
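    A quick shape check makes the semantic difference visible; the broadcast-matmul line is an extra comparison of mine, not one of the timed solutions:

    # dot combines Rn's leading axes with all of p's stacked column vectors at once
    print(np.dot(Rn, p[..., None]).shape)                   # (100, 3, 10, 20, 30, 1)
    # matmul broadcasts batch axes instead, once they are added explicitly
    print((Rn[:, None, None, None] @ p[..., None]).shape)   # (100, 10, 20, 30, 3, 1)

    Dropping the trailing axis of the broadcast-matmul result gives the target (n, Nx, Ny, Nz, 3) array directly.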