Index based dot product with numpy

I am trying to optimize a transformation problem and let numpy do as much heavy-lifting as possible.

In my case I have a range of coordinate sets that each have to be dotted with corresponding indexed roll/pitch/yaw values.

Programatically it looks like this:

In [1]: import numpy as np 
   ...: from ds.tools.math import rotation_array 
   ...: from math import pi 
   ...:  
   ...: rpy1 = rotation_array(pi, 0.001232, 1.234243) 
   ...: rpy2 = rotation_array(pi/1, 1.325, 0.5674543)
In [2]: rpy1                                                                                                           
Out[2]: 
array([[ 3.30235500e-01,  9.43897768e-01, -1.23199969e-03],
       [ 9.43898485e-01, -3.30235750e-01,  1.22464587e-16],
       [-4.06850342e-04, -1.16288264e-03, -9.99999241e-01]])

In [3]: rpy2                                                                                                           
Out[3]: 
array([[ 2.05192356e-01,  1.30786082e-01, -9.69943863e-01],
       [ 5.37487075e-01, -8.43271987e-01,  2.97991829e-17],
       [-8.17926489e-01, -5.21332290e-01, -2.43328794e-01]])
   ...:  
   ...: a1 = np.array([[-9.64996132, -5.42488639, -3.08443], 
   ...:                [-8.08814188, -4.56431952, -3.01381]]) 
   ...:  
   ...: a2 = np.array([[-6.91346292, -3.91137259, -2.82621], 
   ...:                [-4.34534536, -2.34535546, -4.87692]])

Then I dot the coordinates in a1 with rpy1 and a2 with rpy2

In [4]: a1.dot(rpy1)                                                                                                   
Out[4]: 
array([[-8.30604694, -7.31349869,  3.09631641],
       [-6.97801968, -6.12357288,  3.0237723 ]])

In [5]: a2.dot(rpy2)                                                                                                   
Out[5]: 
array([[-1.20926993,  3.86756074,  7.3933692 ],
       [ 1.83673215,  3.95195774,  5.40143613]])

Instead of iterating over lists of a's and rpy's I want to do the whole thing in one operation. So I was hoping for that effect with the following code, so that each set of coordinates in a12 would be dotted with the corresponding indexed array from rpy_a.

But as it is clear, from the output I an getting more than I was hoping for:

In [6]: rpy_a = np.array([rpy1, rpy2]) 
   ...:  
   ...: a12 = np.array([a1, a2]) 

In [7]: a12.dot(rpy_a)                                                                                                 
Out[7]: 
array([[[[-8.30604694, -7.31349869,  3.09631641],
         [-2.37306761,  4.92058705, 10.1104514 ]],

        [[-6.97801968, -6.12357288,  3.0237723 ],
         [-1.6478126 ,  4.36234287,  8.57839034]]],


       [[[-5.9738597 , -5.23064061,  2.83472524],
         [-1.20926993,  3.86756074,  7.3933692 ]],

        [[-3.64678058, -3.32137028,  4.88226976],
         [ 1.83673215,  3.95195774,  5.40143613]]]])

What I need is:

array([[[-8.30604694, -7.31349869,  3.09631641],
        [-6.97801968, -6.12357288,  3.0237723 ]],

       [[-1.20926993,  3.86756074,  7.3933692 ],
        [ 1.83673215,  3.95195774,  5.40143613]]])

Can anyone tell me how to achieve this?

EDIT: Runnable example:

import numpy as np

rpy1 = np.array([[ 3.30235500e-01,  9.43897768e-01, -1.23199969e-03],
                 [ 9.43898485e-01, -3.30235750e-01,  1.22464587e-16],
                 [-4.06850342e-04, -1.16288264e-03, -9.99999241e-01]])

rpy2 = np.array([[ 2.05192356e-01,  1.30786082e-01, -9.69943863e-01],
                 [ 5.37487075e-01, -8.43271987e-01,  2.97991829e-17],
                 [-8.17926489e-01, -5.21332290e-01, -2.43328794e-01]])

a1 = np.array([[-9.64996132, -5.42488639, -3.08443],
               [-8.08814188, -4.56431952, -3.01381]])

a2 = np.array([[-6.91346292, -3.91137259, -2.82621],
               [-4.34534536, -2.34535546, -4.87692]])

print(a1.dot(rpy1))
# array([[-8.30604694, -7.31349869,  3.09631641],
#        [-6.97801968, -6.12357288,  3.0237723 ]])
print(a2.dot(rpy2))
# array([[-1.20926993,  3.86756074,  7.3933692 ],
#        [ 1.83673215,  3.95195774,  5.40143613]])

rpy_a = np.array([rpy1, rpy2])
a12 = np.array([a1, a2])

print(a12.dot(rpy_a))
# Result:
# array([[[[-8.30604694, -7.31349869,  3.09631641],
#          [-2.37306761,  4.92058705, 10.1104514 ]],
#         [[-6.97801968, -6.12357288,  3.0237723 ],
#          [-1.6478126 ,  4.36234287,  8.57839034]]],
#        [[[-5.9738597 , -5.23064061,  2.83472524],
#          [-1.20926993,  3.86756074,  7.3933692 ]],
#         [[-3.64678058, -3.32137028,  4.88226976],
#          [ 1.83673215,  3.95195774,  5.40143613]]]])

# Need:
# array([[[-8.30604694, -7.31349869,  3.09631641],
#         [-6.97801968, -6.12357288,  3.0237723 ]],
#        [[-1.20926993,  3.86756074,  7.3933692 ],
#         [ 1.83673215,  3.95195774,  5.40143613]]])

Solution

Assuming you want to treat an arbitrary number of arrays rpy1, rpy2, ..., rpyn and a1, a2, ..., an, I suggest explicit first axis concatenation with explicit broadcasting, simply for the cause of "explicit is better than implicit":

a12 = np.concatenate([_a[None, ...] for _a in (a1, a2)], axis=0)
rpy_a = np.concatenate([_a[None, ...] for _a in (rpy1, rpy2)], axis=0)

This is equal to:

a12 = np.array([a1, a2])
rpy_a = np.array([rpy1, rpy2])

np.array requires less code and is also faster than my explicit approach, but I just like defining the axis explicitly so that everyone reading the code can guess the resulting shape without executing it.

Whatever path you choose, the important part is the following:

np.einsum('jnk,jkm->jnm', a12, rpy_a)
# Out:
array([[[-8.30604694, -7.3134987 ,  3.09631641],
        [-6.97801969, -6.12357288,  3.0237723 ]],

       [[-1.20926993,  3.86756074,  7.3933692 ],
        [ 1.83673215,  3.95195774,  5.40143613]]])

Using the Einstein summation convention, you can define your np.matmul (equal to np.dot for 2D-arrays) to be executed along a specific axis.
In this case, we define the concatenation axis j (or first dim. or axis 0) to be the shared axis, along which the operation 'nk,km->nm' (equal to np.matmul, see the signature description in the out parameter) is performed.

It is also possible to simply use np.matmul (or the python operator @) for the same results:

np.matmul(a12, rpy_a)
a12 @ rpy_a

But again: For the general case, where the concatenation axis or shapes may change, the more explicit np.einsum is preferable. If you know that no changes will be made to shapes etc., np.matmul should be preferred (less code and faster).