Search code examples
pythonnumpyinterpolationdata-analysisbilinear-interpolation

Putting contributions of continuous values in a discrete 2D grid, based on distance from the nearest pixels


I have a numpy array containing coordinates of points (in 3D, but I am have started off by trying the method in 1D and 2D first) that I would like to fit in a discrete grid. However, I do not want to just move the points to the grid pixel which they are closest to, but rather put on each pixel a weighted value which depends on how far from that pixel the point actually is. For example, if in 1D I have a point x = 15.4, in my discrete space this point would contribute 60% to pixel 15 and 40% to pixel 16. I have managed to do this in 1D and the code is as follows:

# Make test data

N = 1000

data = np.random.normal(
    loc = 10,
    scale = 2,
    size = N
)

# set data range, binsize, and coordinates
xmin = 0
xmax = 20
xbinsize = 0.5

# define the bin centres
xbincenters = np.arange(
    xmin+xbinsize*0.5,
    xmax,
    xbinsize
)

# transform data and floor it to get the closest pixel in the discrete grid
data_t = data/xbinsize - 0.5
bin_indices = np.floor(data_t).astype(int)
# calculate the difference between continuous coordinate and closest pixel
dx = data_t - bin_indices

# get data range in bin indices

index_min = np.ceil(xmin/xbinsize - 0.5).astype(int)
index_max = np.ceil(xmax/xbinsize - 0.5).astype(int)

# do the interpolation

minlength = index_max - index_min

output_array = np.bincount(
    bin_indices,
    weights = (1-dx),
    minlength = minlength
) +\
np.bincount(
    bin_indices+1,
    weights = dx,
    minlength = minlength
)

# finally plot a histogram of the continuous values against output_array
fig,ax = plt.subplots()
ax.hist(data, bins=np.arange(0,20,0.5))
ax.plot(
    xbincenters,
    output_array,
    color = 'r'
)
plt.show()

comparison plot

I have been trying to generalise this to 2D but to no avail, since np.bincount() only works on 1D arrays.

Up until trying to find the nearest pixels, I feel like the code should be similar, as all the operations above can be broadcasted to a 2D array. In this case, we need to distribute any coordinate to the nearest 4 pixels, depending on a difference dx and now another difference dy (which, if you used the above code for an array of shape (1000, 2) would just be the 1st and second column of the array dx respectively). I tried unraveling my bin_indices and dx arrays to use np.bincount() on them, but at this point I am not sure if that is correct or what operations to do on the unraveled arrays. Would generalising this problem in 2D (and later in 3D) need a different kind of approach? Any help would be appreciated, thank you in advance!


Solution

  • So, if I get it correctly, you did "by hand" all the interpolation job (there are probably some code to do that somewhere, but can't think of any right now), and use bincount just to increase the output array (because output_array[indices] += weight wouldn't have worked, indeed, if indices contain repetitions)

    Then, you could just modify indices to target yourself a 2D arrays. Using a width and a height, and using indicesy*width+indicesx as indices. And then "unravel" the array once finished.

    Like this

    import numpy as np
    import matplotlib.pyplot as plt
    
    N = 1000
    
    # Big array for performance test
    datax = np.random.normal(
        loc = 10,
        scale = 2,
        size = N
    )
    datay = np.random.normal(
        loc = 10,
        scale = 2,
        size = N
    )
    
    # Or small one for result test (comment if not wanted)
    datax = np.array([15.2, 1.5, 15.3])
    datay = np.array([5.1, 10.5, 15.9])
    
    
    # set data range, binsize, and coordinates
    xmin = 0
    xmax = 20
    xbinsize = 0.5
    ymin = 0
    ymax = 20
    ybinsize = 0.5
    
    # define the bin centres
    xbincenters = np.arange(
        xmin+xbinsize*0.5,
        xmax,
        xbinsize
    )
    ybincenters = np.arange(
        ymin+ybinsize*0.5,
        ymax,
        ybinsize
    )
    
    # transform data and floor it to get the closest pixel in the discrete grid
    datax_t = datax/xbinsize - 0.5
    datay_t = datay/ybinsize - 0.5
    bin_indicesx = np.floor(datax_t).astype(int)
    bin_indicesy = np.floor(datay_t).astype(int)
    # calculate the difference between continuous coordinate and closest pixel
    dx = datax_t - bin_indicesx
    dy = datay_t - bin_indicesy
    
    # get data range in bin indices
    
    index_minx = np.ceil(xmin/xbinsize - 0.5).astype(int)
    index_miny = np.ceil(ymin/ybinsize - 0.5).astype(int)
    index_maxx = np.ceil(xmax/xbinsize - 0.5).astype(int)
    index_maxy = np.ceil(ymax/ybinsize - 0.5).astype(int)
    
    # do the interpolation
    
    minlengthx = index_maxx - index_minx
    minlengthy = index_maxy - index_miny
    minlength=minlengthx*minlengthy
    
    flat_output_array = np.bincount(
        bin_indicesy*minlengthx+bin_indicesx,
        weights = (1-dx)*(1-dy),
        minlength = minlength
    ) +\
    np.bincount(
        bin_indicesy*minlengthx+bin_indicesx+1,
        weights = dx*(1-dy),
        minlength = minlength
    ) +\
    np.bincount(
        (bin_indicesy+1)*minlengthx+bin_indicesx,
        weights = (1-dx)*dy,
        minlength = minlength
    ) +\
    np.bincount(
        (bin_indicesy+1)*minlengthx+bin_indicesx+1,
        weights = dx*dy,
        minlength = minlength
    ) 
    
    output_array=flat_output_array.reshape(-1, minlengthx)
    
    # finally plot a histogram of the continuous values against output_array
    plt.imshow(output_array)
    
    plt.show()
    

    3 points

    1000 normal distribution points