python-3.x scipy interpolation large-data

How to set up the interpolation problem using ndimage.map_coordinates?

According to the documentation of scipy.ndimage.map_coordinates,

The array of coordinates is used to find, for each point in the output, the corresponding coordinates in the input. The value of the input at those coordinates is determined by spline interpolation of the requested order.

The shape of the output is derived from that of the coordinate array by dropping the first axis. The values of the array along the first axis are the coordinates in the input array at which the output value is found.

I have a discrete 3-d function defined on a 3-d grid (t, x, y); at every point of this grid, the function has a unique value unless its value is zero.

I have another set of arrays in the form of a pandas dataframe with three columns, t_new, x_new, and y_new.

I would like to use scipy.ndimage.map_coordinates to interpolate the function in order to calculate its value on the new dataset presented in the said dataframe.

Since I am getting the following error message, I am sure I am not setting up map_coordinates correctly:

File "D:\Users\username\Anaconda3\lib\site-packages\scipy\ndimage\interpolation.py", line 437, in map_coordinates
    raise RuntimeError('invalid shape for coordinate array')

Here is my definition of the interpolation function:

from scipy.ndimage import map_coordinates


def interpolator_3d(df, func_values):
    # The coordinates at which input is evaluated
    coordinates = df[['t_new', 'x_new', 'y_new']].values.T    # (3, 1273)
    # list of input array [[t0, x0, y0, value0], [t1, x1, y1, value1], ...]
    input_arr = func_values                                 # (1780020000, 4)
    return map_coordinates(input_arr, coordinates)

Solution

There are at least two issues with how you are using map_coordinates. Keep in mind that this function was designed for image resampling.

1. If you have a 3-d function, the array input_arr has to be 3-dimensional. map_coordinates uses the array indices as the t, x and y coordinates, and the function value v has to be stored at the corresponding position, i.e. input_arr[t, x, y] = v. If your original function is defined on a different base grid, you have to rescale the coordinates to the array's indices before interpolating (and back afterwards, if needed). This requires an equidistant input grid.
2. The coordinates have to be an array of the form [[t_new_0, t_new_1, ...], [x_new_0, x_new_1, ...], [y_new_0, y_new_1, ...]]. The result is then an array of interpolated samples [v_new_0, v_new_1, ...]. In general, if input_arr is n-dimensional, coordinates has to contain n arrays of the same shape S, and the result is a single array of shape S.
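The rescaling to array indices can be sketched as follows. This is a minimal example under assumptions: the grid origin and spacing (t0, dt, x0, dx, y0, dy), the helper to_index_coords and the sample values are all made up for illustration, and order=1 (linear interpolation) is chosen only so the result is easy to check by hand.

```python
import numpy as np
from scipy import ndimage

# Hypothetical equidistant grid: origin and spacing along each axis
t0, dt = 10.0, 2.0   # t = 10, 12, 14, 16
x0, dx = 0.0, 0.5    # x = 0.0, 0.5, 1.0
y0, dy = -1.0, 1.0   # y = -1, 0, 1

# Illustrative function values on that grid, shape (4, 3, 3)
T, X, Y = np.meshgrid(t0 + dt * np.arange(4),
                      x0 + dx * np.arange(3),
                      y0 + dy * np.arange(3),
                      indexing="ij")
values = T + 10 * X + 100 * Y


def to_index_coords(t, x, y):
    """Convert physical coordinates to fractional array indices."""
    return np.array([(t - t0) / dt, (x - x0) / dx, (y - y0) / dy])


# Two query points given in physical units
coords = to_index_coords(np.array([11.0, 15.0]),
                         np.array([0.25, 0.75]),
                         np.array([-0.5, 0.5]))

out = ndimage.map_coordinates(values, coords, order=1)
print(coords.shape)  # (3, 2): one row per axis, one column per point
print(out)
```

Because the sample function is linear in t, x and y, linear interpolation reproduces it exactly at the interior query points, which makes the index conversion easy to verify.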

Example with n=3 dimensions and 5 sample points arranged in a 1-dimensional shape S = (5,):

import numpy as np
from scipy import ndimage

a = np.arange(64.).reshape((4, 4, 4))
print(a)

out = ndimage.map_coordinates(a, [
  [0.5, 1.0, 1.5, 2.0, 2.5], [0.1, 0.2, 0.3, 0.4, 0.5], [2.0, 1.9, 1.8, 1.7, 1.6]
])
print(out)

Output:

[[[ 0.  1.  2.  3.]
  [ 4.  5.  6.  7.]
  [ 8.  9. 10. 11.]
  [12. 13. 14. 15.]]
 [[16. 17. 18. 19.]
  [20. 21. 22. 23.]
  [24. 25. 26. 27.]
  [28. 29. 30. 31.]]
 [[32. 33. 34. 35.]
  [36. 37. 38. 39.]
  [40. 41. 42. 43.]
  [44. 45. 46. 47.]]
 [[48. 49. 50. 51.]
  [52. 53. 54. 55.]
  [56. 57. 58. 59.]
  [60. 61. 62. 63.]]]

[ 7.6688, 18.148 , 26.3424, 34.6304, 45.3904]
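To illustrate that the output takes the shape S of the coordinate arrays, here is a small variation of the example with S = (2, 3). The coordinate values are arbitrary, and order=1 with integer coordinates is used only so that the interpolated values coincide with plain array lookups.

```python
import numpy as np
from scipy import ndimage

a = np.arange(64.).reshape((4, 4, 4))

# One coordinate array per axis of `a`, each of shape S = (2, 3)
t_idx = np.array([[0.0, 1.0, 2.0], [1.0, 2.0, 2.0]])
x_idx = np.array([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]])
y_idx = np.array([[1.0, 1.0, 1.0], [0.0, 2.0, 1.0]])

out = ndimage.map_coordinates(a, [t_idx, x_idx, y_idx], order=1)
print(out.shape)  # (2, 3), i.e. the coordinate shape with the first axis dropped

# At integer coordinates, linear interpolation returns the stored values
expected = a[t_idx.astype(int), x_idx.astype(int), y_idx.astype(int)]
print(np.allclose(out, expected))
```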

Update:

That means that if your func_values array has the form [[t0, x0, y0, value0], [t1, x1, y1, value1], ...] with length 1780020000 = 19778 * 500 * 180, and t, x and y are already integer grid indices, it has to be transformed into an array of shape (19778, 500, 180):

t_max, x_max, y_max, _ = np.max(func_values, axis=0).astype(int) + 1  # 19778, 500, 180
input_arr = np.zeros((t_max, x_max, y_max), dtype=float)

for t, x, y, v in func_values:
  input_arr[int(t), int(x), int(y)] = v
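A Python-level loop over 1.78 billion rows will be very slow. The same fill can be sketched with NumPy integer-array indexing, as below. The small func_values array here is a made-up stand-in for the real data, the query coordinates are arbitrary, and the sketch assumes, like the loop above, that the first three columns are integer grid indices.

```python
import numpy as np
from scipy import ndimage

# Small stand-in for func_values: rows of [t, x, y, value]
shape = (3, 4, 5)
t, x, y = np.indices(shape)
func_values = np.column_stack([
    t.ravel(), x.ravel(), y.ravel(),
    (t + 10 * x + 100 * y).ravel(),
]).astype(float)

# Vectorized equivalent of the loop: assign all values in one indexing step
idx = func_values[:, :3].astype(int)
input_arr = np.zeros(tuple(idx.max(axis=0) + 1), dtype=float)
input_arr[idx[:, 0], idx[:, 1], idx[:, 2]] = func_values[:, 3]

# Query coordinates in index units, shape (3, n_points) as in the question
coordinates = np.array([[0.5, 1.5],
                        [1.0, 2.5],
                        [2.0, 3.5]])

out = ndimage.map_coordinates(input_arr, coordinates, order=1)
print(input_arr.shape)  # (3, 4, 5)
print(out)
```

For the full-size data, an array of shape (19778, 500, 180) holds 1,780,020,000 float64 values, roughly 14 GB, so memory is likely to be the limiting factor. If the rows of func_values are already sorted by t, then x, then y, reshaping the value column to the grid shape may be enough and avoids the index arrays entirely.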