I have a function that takes an array and returns an upsampled version of it, using linear interpolation between the existing data points. The input array has approximately 5000 elements. I would also like to implement a version that handles 2D arrays of shape ~(4000, 5000).
Why is this function not faster with numba.njit? It runs very fast on 1D arrays without Numba; with numba.njit it takes much longer. Any advice is much appreciated!
import numpy as np
import numba as nb

@nb.njit
def sync_sampl(time_series, repeat=10):
    if time_series.ndim == 1:
        time_series = time_series.repeat(repeat)
        # from https://mail.python.org/pipermail/scipy-user/2010-November/027574.html
        # by Dharas Pothina
        x = time_series
        xtmp = x.copy()
        xtmp[1:] = x[1:] - x[:-1]
        x_idx = np.nonzero(xtmp)[0]
        x_vals = x[x_idx]
        x_new = np.interp(np.arange(len(x)), x_idx, x_vals)
        x_new = x_new[:-(repeat - 1) or None]  # drop the last elements so the result does not end in a flat line
        return x_new
    else:
        # Implement 2d version here?
        pass
Here are some tips to make the Numba code faster:
- Specify an explicit signature, e.g. @nb.njit('float64[::1](float32[::1], int64)'). Note that plain float is not a valid Numba type, and np.interp returns a float64 array even for float32 input.
- Pass parallel=True to @nb.njit to execute some NumPy functions in parallel. However, most functions are not yet parallelized this way. Still, you can run explicit loops in parallel with it and nb.prange.
- The lines from xtmp = x.copy() to x_vals = x[x_idx] can be rewritten as one fast, memory-efficient loop (and so no temporary buffers).
- Use fastmath=True if you are sure there are no NaN/+Inf/-Inf/-0 values in your data (and that you do not need exact IEEE-754 rules such as rounding) to improve performance even further. Using 32-bit floats may help too, despite the significant loss of precision.