Search code examples
numpymultidimensional-arrayinterpolation

Implementing 1D interpolation on a 3D Array in Numpy or Xarray


I have been attempting to find a way to utilize np.interp() to interpolate the innermost dimension of a 3D array:

Say I have a 3D array with shape: (lat,lon,depth) = (3,3,4) as a toy example:

soil_temp = np.asarray([[[ 7.720276, 7.1702576, 4.821167, -1.7201233 ],
              [ 8.950958, 7.4095764, 1.5319519, -0.59176636],
              [ 4.1655273, 3.452118, 0.5239258, -3.0393372 ]],
             [[ 9.18335, 8.561096, 6.048462, -1.2128601 ],
              [ 8.009003, 7.6471863, 6.0042725, -0.17993164],
              [ 5.075775, 4.0643005, 0.55792236, -2.9087524 ]],
             [[13.139526, 12.393707, 9.50769, 0.5014343 ],
             [10.899994, 8.8767395, 1.443573, -0.96343994],
             [9.239685, 7.3506165, 0.4899292, -0.33395386]]])

My goal is to estimate the active layer thickness (the depth at which the temperature reaches 0 degrees). I know how to do this for a singular array:

depths = np.asarray([3.5,17.5,64,194.5])
stemps = np.asarray([7.720276,7.1702576,4.821167,-1.7201233])

thaw_point=[0]

order = stemps.argsort() #sort into monotonically increasing values

y_data = stemps[order] #sort into monotonically increasing values
x_data = depths[order] #sort into monotonically increasing values

thaw_point=0

ALT = np.interp(thaw_point,y_data,x_data) 

Which gives me ALT = 160.183cm

However I have been having a tough time figuring out how to apply np.interp() to the 3D array in a computationally efficient manner.

I can do it in a loop structure, which would be fine for this small example, but not for a much larger array:

def ALT_interp(data):
    thaw_point = 0
    array_shape = data.shape
    print(array_shape)
    deps = depths
    ALT_array=np.empty([array_shape[0],array_shape[1],1])
    for idx in range(0,array_shape[0]):
        for idx2 in range(0, array_shape[1]):
            stemps = data[idx,idx2,:]
            order = stemps.argsort()   
            y_data = stemps[order]
            x_data = depths[order]
            ALT = np.interp(thaw_point,y_data,x_data)
            ALT_array[idx,idx2,0] = ALT

    return ALT_array
    
depths = np.asarray([3.5,17.5,64,194.5])

soil_temp = np.asarray([[[ 7.720276, 7.1702576, 4.821167, -1.7201233 ],
              [ 8.950958, 7.4095764, 1.5319519, -0.59176636],
              [ 4.1655273, 3.452118, 0.5239258, -3.0393372 ]],
             [[ 9.18335, 8.561096, 6.048462, -1.2128601 ],
              [ 8.009003, 7.6471863, 6.0042725, -0.17993164],
              [ 5.075775, 4.0643005, 0.55792236, -2.9087524 ]],
             [[13.139526, 12.393707, 9.50769, 0.5014343 ],
             [10.899994, 8.8767395, 1.443573, -0.96343994],
             [9.239685, 7.3506165, 0.4899292, -0.33395386]]])
   
ALT_values = ALT_interp(soil_temp)

But is there a more computationally efficient way to do this (since I am hoping to scale this up to a 1801x3600x4 array)?


Solution

  • Neither the NumPy or SciPy 1D interpolators seem to be designed for this. They seem to expect the x-coordinates to be a 1D array, but you want the x-coordinates to be multidimensional.

    One approach would be to use one of the SciPy interpolators (e.g. PchipInterpolator) with depths as the x-data and soil_temp as the y-data, then use the roots method to find the root.

    However, since you seem to be happy with linear interpolation, you might as well do linear interpolation manually. It appears that the points to interpolate between are the last two in each row, so:

    import numpy as np
    from scipy import interpolate
    
    depths = np.asarray([3.5,17.5,64,194.5])
    
    soil_temp = np.asarray([[[ 7.720276, 7.1702576, 4.821167, -1.7201233 ],
                  [ 8.950958, 7.4095764, 1.5319519, -0.59176636],
                  [ 4.1655273, 3.452118, 0.5239258, -3.0393372 ]],
                 [[ 9.18335, 8.561096, 6.048462, -1.2128601 ],
                  [ 8.009003, 7.6471863, 6.0042725, -0.17993164],
                  [ 5.075775, 4.0643005, 0.55792236, -2.9087524 ]],
                 [[13.139526, 12.393707, 9.50769, 0.5014343 ],
                 [10.899994, 8.8767395, 1.443573, -0.96343994],
                 [9.239685, 7.3506165, 0.4899292, -0.33395386]]])
    
    # In your data, the zero-crossing appears to occur
    # between or beyond the last two data points. If
    # that is not true in general, you might have to 
    # find the points to interpolate between. If you
    # have trouble with that and need to a ask a question,
    # that would be a separate question with a separate
    # answer.
    x1 = depths[-2]
    x2 = depths[-1]
    y1 = soil_temp[:, :, -2]
    y2 = soil_temp[:, :, -1]
    thaw_point=0
    
    # interpolate, but flip the roles of x and y
    roots = x1 + (x2 - x1)/(y2 - y1) * (thaw_point - y1)
    
    # compare against your solution
    roots2 = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            stemps = soil_temp[i, j]
            order = stemps.argsort() #sort into monotonically increasing values
            y_data = stemps[order] #sort into monotonically increasing values
            x_data = depths[order] #sort into monotonically increasing values
            thaw_point=0
            roots2[i, j] = np.interp(thaw_point, y_data, x_data) 
    
    np.testing.assert_allclose(roots, roots2)
    

    The one failure is for the row that doesn't have a zero crossing in the data. Your np.interp solution did not extrapolate, so it is returning the maximum depth in the data (194.5), but it seems that the real "active layer thickness" would be higher than that. The manual interpolation does that extrapolation.


    If the zero crossing is not always between the last two elements of each row (or beyond that), replace definition of x1, x2, y1, y2 with:

    # broadcast depths to the same shape as soil_temp
    depths3d = np.broadcast_to(depths, soil_temp.shape)
    
    # Assume that we will interpolate between the last two elements.
    # We need these to be writeable, and you probably don't want the original
    # data modified, so make copies.
    x1 = depths3d[:, :, -2].copy()
    x2 = depths3d[:, :, -1].copy()
    y1 = soil_temp[:, :, -2].copy()
    y2 = soil_temp[:, :, -1].copy()
    
    # Find the indices at which soil_temp changes sign
    i1, i2, i3 = np.nonzero(np.diff(soil_temp < 0))
    
    # Update our assumptions about the interpolation points.
    # If no zero crossing is found within a row, the points
    # are not updated. I'm assuming this occurs when all the
    # soil temps along a row are positive, which means that
    # you want to extrapolate from the last two observations,
    # as we assumed before. If all the soil temps along the
    # row are *negative*, you'd want to extrapolate from the
    # left side. This is left as an exercise for the reader.
    x1[i1, i2] = depths3d[i1, i2, i3]
    x2[i1, i2] = depths3d[i1, i2, i3 + 1]
    y1[i1, i2] = soil_temp[i1, i2, i3]
    y2[i1, i2] = soil_temp[i1, i2, i3 + 1]