Let v, u be the 3D-coordinate matrices of shapes (n, 3) and (m, 3), respectively.
I want to compute the matrix M of shape (n, m) such that M[i, j] = ||v[i] - u[j]|| -- the Euclidean (L2) norm. The task can be solved with a nested loop:
import numpy as np

n, m = 100, 200
v = np.random.rand(n, 3)
u = np.random.rand(m, 3)
M = np.zeros((n, m))
for i in range(n):
    for j in range(m):
        M[i, j] = np.linalg.norm(v[i] - u[j], ord=2)
How can I do this in a vectorized way in NumPy instead?
You can vectorize this by taking advantage of array broadcasting (see https://numpy.org/doc/stable/user/basics.broadcasting.html). If array v has shape (n, 1, 3) and array u has shape (m, 3), then an elementwise arithmetic operation like v - u produces an array of shape (n, m, 3). You can then apply the norm along the last axis by passing axis=-1 to np.linalg.norm:
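As a quick standalone sanity check of the shape rule above (toy sizes chosen purely for illustration), before the full solution:

```python
import numpy as np

a = np.ones((4, 1, 3))  # plays the role of v.reshape(n, 1, 3)
b = np.ones((5, 3))     # plays the role of u
# b is treated as (1, 5, 3), so the result broadcasts to (4, 5, 3)
print((a - b).shape)  # (4, 5, 3)
```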
import numpy as np

n, m = 100, 200
v = np.random.rand(n, 3)
u = np.random.rand(m, 3)

def looped(v, u):
    n = v.shape[0]
    m = u.shape[0]
    M = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            M[i, j] = np.linalg.norm(v[i] - u[j], ord=2)
    return M

def broadcast(v, u):
    # v.reshape(-1, 1, 3) has shape (n, 1, 3); subtracting u broadcasts to (n, m, 3)
    diff = v.reshape(-1, 1, 3) - u
    return np.linalg.norm(diff, ord=2, axis=-1)

# crude evidence that the two functions do the same thing
looped_result = looped(v, u)
broadcast_result = broadcast(v, u)
print(np.linalg.norm(looped_result - broadcast_result))
# 0.0
As you might expect, the broadcast version is massively faster:
%timeit looped(v,u)
263 ms ± 32 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit broadcast(v,u)
648 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Array broadcasting takes some getting used to, but that comes with practice.
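As an aside, for large n and m the (n, m, 3) intermediate array can get expensive; the same pairwise-distance matrix can be computed without it using the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b. A sketch (the function name dot_trick is my own, and the small-negative clipping guards against floating-point cancellation):

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.random((100, 3))
u = rng.random((200, 3))

def dot_trick(v, u):
    # ||v_i - u_j||^2 = ||v_i||^2 + ||u_j||^2 - 2 * (v_i . u_j)
    sq = (v**2).sum(axis=1)[:, None] + (u**2).sum(axis=1) - 2.0 * (v @ u.T)
    # cancellation can yield tiny negative values; clip before the sqrt
    return np.sqrt(np.maximum(sq, 0.0))

def broadcast(v, u):
    # the (n, m, 3) broadcasting approach, for comparison
    return np.linalg.norm(v.reshape(-1, 1, 3) - u, axis=-1)

print(np.allclose(dot_trick(v, u), broadcast(v, u)))  # True
```

The trade-off is numerical: the dot-product form can lose a little precision for nearly coincident points, while the broadcast form stays accurate but allocates the larger intermediate array.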