Matrix of norm differences from two coordinates matrices (numpy)

Let v, u be the 3D-coordinate matrices of shapes (n, 3) and (m, 3), respectively.

I want to compute the matrix M of shape (n, m) such that M[i, j] = ||v[i] - u[j]|| -- the second vector norm. The task can be solved via cycle:

n, m = 100, 200
M = np.zeros((n, m))
v = np.random.rand(n, 3)
г = np.random.rand(m, 3)

for i in range(n):
    for j in range(m):
        M[i, j] = np.linalg.norm(v[i] - u[j], ord=2)

Instead, how can I do that in vectorized way in numpy?

Solution

You can vectorize by taking advantage of array broadcasting (see https://numpy.org/doc/stable/user/basics.broadcasting.html). If array v has shape (n, 1, 3), and array u has shape (m, 3), the array that results from applying an arithmetic operation like v - u will have shape (n, m, 3). You can then apply the linalg norm operation along an axis by specifying the axis in the call to np.linalg.norm

import numpy as np

n, m = 100, 200
v = np.random.rand(n, 3)
u = np.random.rand(m, 3)

def looped(v, u):
    n = v.shape[0]
    m = u.shape[0]
    M = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            M[i, j] = np.linalg.norm(v[i] - u[j], ord=2)
    return M

def broadcast(v, u):
    diff = v.reshape(-1, 1, 3) - u
    return np.linalg.norm(diff, ord=2, axis=-1)

# crude evidence that the two functions do the same thing
looped_result = looped(v, u)
broadcast_result = broadcast(v, u)
print(np.linalg.norm(looped_result-broadcast_result))
# 0.0

This has a massive speedup, as you might expect.

%timeit looped(v,u)
263 ms ± 32 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit broadcast(v,u)
648 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

It takes some time to get used to array broadcasting, but that's just practice.