Search code examples
numpyscikit-learnscipy

Distance matrix for images (with custom metric) in scipy/sklearn


Consider the following custom metric between 2 images (note: I know it is terrible, it is an example). For each 3*3 kernel above the images, it looks at the number of pixels matching and computes the ratio for that kernel. Finally, it outputs 1 - max(ratio).

def custom_metric(a, b):
    w, h = a.shape
    ratio = []
    for i in range(w // ksize):
        for j in range(h // ksize):
            kernel_diff = a[i*ksize:(i+1)*ksize, j*ksize:(j+1)*ksize] - b[i*ksize:(i+1)*ksize, j*ksize:(j+1)*ksize]
            ratio.append( np.mean(kernel_diff == 0) )
    return 1 - max(ratio)


a = np.array([
    [1,0,0,0,0,1],
    [0,0,1,0,0,0],
    [0,0,0,0,0,0],
    [0,0,0,0,1,0],
    [0,1,0,0,0,1],
    [0,0,0,0,0,0]
])

b = np.array([
    [0,1,0,0,0,0],
    [0,0,1,0,0,1],
    [1,0,0,0,1,0],
    [0,0,0,0,1,0],
    [0,0,0,1,0,1],
    [0,0,0,1,1,1]
])

print(custom_metric(a, b)) # outputs 1/9 because of the bottom left window

All the functions for computing distance matrices in scipy / sklearn that I have seen take as an input an array of shape (n_samples_X, n_features) like sklearn's pairwise_distances. Is there a function allowing higher dimensional arrays, for example of shape (n_samples_X, width, height) I could call my metric on?


Solution

  • Try flattening the features onto a single axis. Rather than having (samples, height, width, channels, ...) you'd reshape your data to a 2D array of (samples, height * width * channels * ...). This will work with an arbitrary number of dimensions which you can then supply to sklearn.

    In your case, try a_reshaped = a.reshape([len(a), -1]). That'll give you a 2D array of shape (samples, height * width) by moving the height and width axes onto the same dimension.


    Demonstrating how to flatten the height and width into a single dimension whilst preserving the contiguity of each patch. Also shows how to run the OP's metric on the flattened array.

    #Reshapes array to 1 x features, preserving
    #patch contiguity
    def special_flatten(arr, k):
        kr, kc = k.tolist()
        nrow = arr.shape[0] // kr
        ncol = arr.shape[1] // kc
        return (
            arr
            .reshape(nrow, kr, ncol, kc)
            .swapaxes(1, 2)
            .reshape(nrow, ncol, kr * kc)
            .reshape(nrow * ncol, kr * kc)
            .reshape(1, -1)
        )
    
    #Computes OP's custom metric for a single sample
    # but using the reshaped array
    def custom_metric(a, b):
        k_area = k.prod()
        sample_num = 0
        ratio = []
        for patch_num in range(a.shape[1] // k_area):
            start = patch_num * k_area
            end = start + k_area
            
            a_vals = a[sample_num, start:end]
            b_vals = b[sample_num, start:end]
            diff = a_vals - b_vals
            ratio.append( (diff == 0).mean() )
        
        return 1 - max(ratio)
    

    Usage:

    a = np.array([
        [1,0,0,0,0,1],
        [0,0,1,0,0,0],
        [0,0,0,0,0,0],
        [0,0,0,0,1,0],
        [0,1,0,0,0,1],
        [0,0,0,0,0,0]
    ])
    
    b = np.array([
        [0,1,0,0,0,0],
        [0,0,1,0,0,1],
        [1,0,0,0,1,0],
        [0,0,0,0,1,0],
        [0,0,0,1,0,1],
        [0,0,0,1,1,1]
    ])
    
    k = np.array([3, 3]) #the kernel
    a_flat = special_flatten(a, k) #reshape to 1 x feats
    b_flat = special_flatten(b, k)
    custom_metric(a_flat, b_flat) # 1 / 9
    

    Not required for the above, but some additional test code I used to sanity check the reshape results:

    a = np.array([[1.1, 1.2,  2.1, 2.2],
                  [1.3, 1.4,  2.3, 2.4],
                  
                  [3.1, 3.2,  4.1, 4.2],
                  [3.3, 3.4,  4.3, 4.4]])
    
    kr = kc = 2
    nrow = a.shape[0] // kr
    ncol = a.shape[1] // kc
    a_r = (
        a
        .reshape(nrow, kr, ncol, kc)
        .swapaxes(1, 2)
        .reshape(nrow, ncol, kr * kc)
        .reshape(nrow * ncol, kr * kc)
        .reshape(1, -1)
    )
    a_r
    # a_r is 1.1 1.2 1.3 1.4  2.1 2.2 2.3 2.4  3.1 3.2 3.3 3.4  4.1 4.2 4.3 4.4
    
    patch_area = kr * kc
    for patch_num in range(a_r.size // patch_area):
        print(patch_num, ':', a_r[0, patch_num * patch_area:(patch_num + 1)*patch_area] )
    
    #Yields:
    #
    # 0 : [1.1 1.2 1.3 1.4]
    # 1 : [2.1 2.2 2.3 2.4]
    # 2 : [3.1 3.2 3.3 3.4]
    # 3 : [4.1 4.2 4.3 4.4]