Search code examples
pythonnumpyfor-loopscipyinterpolation

Multiple 1D interpolation in 2D array without loop


I have an np array with ndim = 2. I need to sample interpolated values along one dimension, and I would like to do it as efficiently as possible.

I came up with this solution:

for i in range(my_array.shape[0]):
    my_interp_array[i, :] = np.interp(sample_y , np.arange(array_size_y), my_array[i,:])

... with sample_y being some non-equidistant sampling vector. Although this gives the desired result, this is probably a very inefficient solution.

I also tried scipy.interpolate.interp1d, as it was suggested by others, like this:

y = np.arange(array_size_y)  # equidistant sampling vector
intf = interp1d(y, my_array)  # interpolation function
my_interp_array = intf(np.tile(sample_y, (len(y), 1)))

... but this actually takes >3000x longer than the first idea with the for loop, which confuses me.

Has someone an idea how to improve this and why the scipy interpolation is so much slower for me?

Thanks in advance!


EDIT: I made a reproducible example as some asked for this. Following code gives me results that show that scipy.interpolate.interp1d is 800-1000 times slower than the method with the for loop.

import time
import numpy as np
from scipy.interpolate import interp1d

size_x, size_y = 512, 512

# make dummy arrays
my_array = np.random.rand(size_x, size_y)
my_interp_array = np.zeros_like(my_array)

# make indices
iy = np.arange(size_y)  # "normal" indices
sample_y = iy + (iy ** 2 / np.max(iy ** 2))  # indices from where to get the values
sample_y[-1] = sample_y[-2]  # avoid index out of range error

# for loop method
start1 = time.time()
for i in range(size_y):
    my_interp_array[i, :] = np.interp(sample_y, iy, my_array[i, :])
end1 = time.time()

# scipy interp1d method
start2 = time.time()
intf = interp1d(np.arange(size_y), my_array)
my_interp_array = intf(np.tile(sample_y, (size_y, 1)))
end2 = time.time()

# print times
print(f"interp1d is {(end2 - start2) / (end1 - start1)} times slower than the for loop")

This is less bad than factor 3000, but the difference is still remarkable in my opinion.


Solution

  • You can use scipy.interpolate.make_interp_spline with k=1 (for linear interpolation) and axis=1 (to match the same axis of interpolation as the looping code).

    from scipy.interpolate import make_interp_spline
    
    f = make_interp_spline(iy, my_array, k=1, axis=1)
    my_interp_array = f(sample_y)
    

    Time comparison:

    def original(iy, my_array, sample_y):
        my_interp_array = np.zeros_like(my_array)
        for i in range(size_y):
            my_interp_array[i, :] = np.interp(sample_y, iy, my_array[i, :])
        return my_interp_array
    
    def mine(iy, my_array, sample_y):
        f = make_interp_spline(iy, my_array, k=1, axis=1)
        my_interp_array = f(sample_y)
        return my_interp_array
    
    %timeit original(iy, my_array, sample_y)
    3.38 ms ± 82.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    
    %timeit mine(iy, my_array, sample_y)
    1.64 ms ± 20.1 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
    

    So, this approach results in a little more than a 2x speed up.