python arrays nearest-neighbor

Enlarging a 2D array by a given rescaling factor using nearest-neighbour in Python


I'm trying to write code that enlarges a 2D array by a given rescaling factor in Python using the nearest-neighbour algorithm.

For example, I have an array that looks like this:

[[1, 2],
[3, 4]]

What I want to do is enlarge this array with the NN algorithm and a given rescaling factor.

Let me explain step by step. Assume the rescaling factor is 3. Before filling, the enlarged array should hold each original value at (3*i, 3*j) and be empty everywhere else:

[[1, 0, 0, 2, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[3, 0, 0, 4, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]

After filling the empty elements with their nearest neighbours, it should look like this:

[[1, 1, 2, 2, 2, 2],
[1, 1, 2, 2, 2, 2],
[3, 3, 4, 4, 4, 4],
[3, 3, 4, 4, 4, 4],
[3, 3, 4, 4, 4, 4],
[3, 3, 4, 4, 4, 4]]

This is what the output should look like. (0,2) is 2 instead of 1 because its nearest neighbour is the 2 at (0,3), not the 1 at (0,0).

How can I achieve this?

It was easy to create an array like the one below:

[[1, 1, 1, 2, 2, 2],
[1, 1, 1, 2, 2, 2], 
[1, 1, 1, 2, 2, 2], 
[3, 3, 3, 4, 4, 4], 
[3, 3, 3, 4, 4, 4], 
[3, 3, 3, 4, 4, 4]]

But that is not what I want.
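
For reference, plain repetition of each element produces exactly that block pattern. A minimal sketch, assuming np.repeat was used (the original attempt is not shown, so this is only an illustration):

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
factor = 3  # rescaling factor from the example

# Repeat every row, then every column, `factor` times.
blocks = np.repeat(np.repeat(arr, factor, axis=0), factor, axis=1)
print(blocks)
```

Each value ends up at the top-left corner of its own 3x3 block, so a cell such as (0,2) takes the value at (0,0) even though (0,3) is closer.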


Solution

  • First, create the padded array, but pad it with np.nan instead of 0 so the next step can interpolate over the gaps. If the original array already contains a 0, a mask built from the zeros would wrongly treat that element as empty. Here is the padding function:

    def pad_data(arr, padlen):
        # Spread the original values onto a NaN-filled grid, padlen cells apart.
        m, n = arr.shape
        out = np.full((m * padlen, n * padlen), np.nan)
        out[::padlen, ::padlen] = arr
        return out
    

    Then use NearestNDInterpolator from scipy for the nearest-neighbour interpolation. The full code is below:

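    import numpy as np
    from scipy.interpolate import NearestNDInterpolator

    def pad_data(arr, padlen):
        # Spread the original values onto a NaN-filled grid, padlen cells apart.
        m, n = arr.shape
        out = np.full((m * padlen, n * padlen), np.nan)
        out[::padlen, ::padlen] = arr
        return out


    arr = np.array([[1, 2], [3, 4]])
    arr_pad = pad_data(arr, 3)

    # Indices of the cells that hold original (non-NaN) values.
    mask = np.where(~np.isnan(arr_pad))
    # Fit the interpolator on those (row, col) points and their values.
    interp = NearestNDInterpolator(np.transpose(mask), arr_pad[mask])
    # Evaluate at every (row, col) of the padded grid to fill the NaNs.
    filled_data = interp(*np.indices(arr_pad.shape))
    filled_data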
    

    This gives:

    array([[1., 1., 2., 2., 2., 2.],
           [1., 1., 2., 2., 2., 2.],
           [3., 3., 4., 4., 4., 4.],
           [3., 3., 4., 4., 4., 4.],
           [3., 3., 4., 4., 4., 4.],
           [3., 3., 4., 4., 4., 4.]])
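
    If scipy is not available, the same result for this example can be obtained with index arithmetic alone. This is a sketch and not part of the approach above: enlarge_nearest is a hypothetical helper name, and it assumes each source value sits at (i*factor, j*factor), as in the padded array. For even factors some cells are equidistant from two sources; this sketch then picks the higher-index one, which may differ from what NearestNDInterpolator returns.

```python
import numpy as np

def enlarge_nearest(arr, factor):
    """Enlarge a 2D array by `factor` using nearest-neighbour lookup.

    Hypothetical helper: source value (i, j) is assumed to sit at
    (i * factor, j * factor) in the output grid.
    """
    arr = np.asarray(arr)
    m, n = arr.shape
    # For output index k the nearest source index is round(k / factor),
    # clipped to the last valid index. Integer arithmetic avoids float rounding.
    rows = np.minimum((np.arange(m * factor) + factor // 2) // factor, m - 1)
    cols = np.minimum((np.arange(n * factor) + factor // 2) // factor, n - 1)
    # np.ix_ builds the open mesh needed to pick every (row, col) combination.
    return arr[np.ix_(rows, cols)]

result = enlarge_nearest(np.array([[1, 2], [3, 4]]), 3)
print(result)
```

    Unlike the interpolator version, this keeps the input dtype (integers stay integers) and does not need the NaN-padded intermediate array.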