python numpy image-processing mahalanobis

The most efficient way to process images (Mahalanobis distances)

I've written a script based on this description.

I have a few images stored as 2D numpy arrays. When the images are big, it takes a long time to calculate each value. Is there a better way to do this than my solution?

Consider three images, a means vector, and an inverted covariance matrix:

a = image1
b = image2
c = image3 # numpy arrays, NxM size
m = np.array([m1, m2, m3]) # means vector
covariance = np.cov([a.ravel(), b.ravel(), c.ravel()])
inv_cov = np.linalg.inv(covariance) # inv. covariance matrix

I solved this problem with a double loop:

from scipy.spatial.distance import mahalanobis

y, x = a.shape # y,x are sizes of images (all three share this shape)
result = np.zeros((y, x))

for i in range(y):
    for j in range(x):
        v = np.array([a[i,j], b[i,j], c[i,j]]) # array with particular pixels from each image
        result[i,j] = mahalanobis(v, m, inv_cov) # calculate mahalanobis distance and insert value as a pixel

I think there must be a faster way to do this. Maybe without loops?

Solution

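Use scipy.spatial.distance.cdist, which computes the distance between each pair of points from two collections of inputs. Here the first collection is the set of pixels (one 3-element row per pixel) and the second is the single means vector: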

import numpy as np
import scipy.spatial.distance as SSD
h, w = 40, 60
A = np.random.random((h, w))
B = np.random.random((h, w))
C = np.random.random((h, w))
M = np.array([A.mean(), B.mean(), C.mean()])  # means vector
covariance = np.cov([A.ravel(), B.ravel(), C.ravel()])
inv_cov = np.linalg.inv(covariance)  # inv. covariance matrix

def orig(A, B, C, M, inv_cov):
    h, w = A.shape
    result = np.zeros_like(A, dtype='float64')

    for i in range(h):
        for j in range(w):
            # array with particular pixels from each image
            v = np.array([A[i, j], B[i, j], C[i, j]])
            # calculate mahalanobis distance and insert value as a pixel
            result[i, j] = SSD.mahalanobis(v, M, inv_cov)
    return result

def using_cdist(A, B, C, M, inv_cov):
    D = np.dstack([A, B, C]).reshape(-1, 3)
    result = SSD.cdist(D, M[None, :], metric='mahalanobis', VI=inv_cov)
    result = result.reshape(A.shape)
    return result

expected = orig(A, B, C, M, inv_cov)
result = using_cdist(A, B, C, M, inv_cov)
assert np.allclose(result, expected)
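
To see why the reshape in using_cdist puts each distance back at the right pixel, here is a small sketch on toy 2x2 arrays. The values and the zero reference point are made up for illustration, and the default Euclidean metric is used only to show the shapes:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Toy 2x2 "images" (hypothetical values, chosen so pixels are easy to trace)
A = np.array([[1, 2], [3, 4]])
B = A * 10
C = A * 100

stacked = np.dstack([A, B, C])  # shape (2, 2, 3): last axis holds the 3 images
D = stacked.reshape(-1, 3)      # shape (4, 3): one row per pixel, row-major order
print(D)

# One reference point (illustrative) -> one distance per pixel, shape (4, 1)
ref = np.zeros((1, 3))
dist = cdist(D, ref)            # default metric is Euclidean
print(dist.shape)

# Row k of D is pixel (k // w, k % w), so reshaping restores the image layout
dist_image = dist.reshape(A.shape)
print(dist_image.shape)
```

Because reshape walks the array in row-major order both ways, row k of D and entry k of the cdist output refer to the same pixel.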

Even for three small (40x60) images, using_cdist is ~285x faster:

In [49]: expected = orig(A, B, C, M, inv_cov)

In [76]: result = using_cdist(A, B, C, M, inv_cov)

In [78]: np.allclose(result, expected)
Out[78]: True

In [79]: %timeit orig(A, B, C, M, inv_cov)
10 loops, best of 3: 36.3 ms per loop

In [80]: %timeit using_cdist(A, B, C, M, inv_cov)
10000 loops, best of 3: 127 µs per loop
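
If you would rather not depend on cdist, the same quantity can be written directly with np.einsum. This is only a sketch: using_einsum is a name I made up, the random test images are illustrative, and I have not timed it against using_cdist, so treat it as a cross-check rather than a faster replacement.

```python
import numpy as np

def using_einsum(A, B, C, M, inv_cov):
    # Per-pixel deviation from the means vector, shape (h, w, 3)
    delta = np.dstack([A, B, C]) - M
    # Squared Mahalanobis distance delta^T . inv_cov . delta at every pixel
    sq = np.einsum('hwi,ij,hwj->hw', delta, inv_cov, delta)
    # Clip tiny negative values caused by rounding before taking the root
    return np.sqrt(np.maximum(sq, 0))

# Illustrative data, same sizes as in the timing example
rng = np.random.default_rng(0)
A, B, C = rng.random((3, 40, 60))
M = np.array([A.mean(), B.mean(), C.mean()])
inv_cov = np.linalg.inv(np.cov([A.ravel(), B.ravel(), C.ravel()]))

result = using_einsum(A, B, C, M, inv_cov)
print(result.shape)
```

The einsum string contracts the two channel axes (i and j) against inv_cov while keeping the pixel axes (h and w), which is the vectorized form of what the double loop does one pixel at a time.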