Search code examples
pythonnumpyscipyinterpolationextrapolation

Extrapolate with LinearNDInterpolator


I've got a 3D dataset that I want to interpolate AND extrapolate linearly. The interpolation can be done easily with scipy.interpolate.LinearNDInterpolator. The module can only fill in a constant/nan for values outside the parameter range, but I don't see why it would not offer an option to turn on extrapolation.

Looking at the code, I see that the module is written in cython. With no experience in cython, it is hard to play around with the code to implement extrapolation. I can write it in pure python code, but maybe someone else here has a better idea? My particular case involves a constant xy-grid, but the z-values keep changing a lot (-100,000) and therefore the interpolation must be fast as the interpolation will be runned for each time the z-values change.

To give a basic example, as requested, lets say that I have a grid like

xyPairs = [[-1.0, 0.0], [-1.0, 4.0],
           [-0.5, 0.0], [-0.5, 4.0],
           [-0.3, 0.0], [-0.3, 4.0],
           [+0.0, 0.0], [+0.0, 4.0],
           [+0.2, 0.0], [+0.2, 4.0]]

and lets say I want to calculate values at x = -1.5, -0.8, +0.5 and y = -0.2, +0.2, +0.5. Currently, I am performing 1d interpolation/extrapolation along the x-axis for each y-value and then along the y-axis for each x-value. The extrapolation is done by the second function in ryggyr's answer.


Solution

  • I propose a method, the code is awful but I hope it will help you. The idea is, if you know by advance the bounds in which you will have to extrapolate, you can add extra columns/rows at the edge of your arrays with linearly extrapolated values, and then interpolate on the new array. Here is an example with some data that will be extrapolated until x=+-50 and y=+-40:

    import numpy as np
    x,y=np.meshgrid(np.linspace(0,6,7),np.linspace(0,8,9)) # create x,y grid
    z=x**2*y # and z values
    # create larger versions with two more columns/rows
    xlarge=np.zeros((x.shape[0]+2,x.shape[1]+2))
    ylarge=np.zeros((x.shape[0]+2,x.shape[1]+2))
    zlarge=np.zeros((x.shape[0]+2,x.shape[1]+2))
    xlarge[1:-1,1:-1]=x # copy data on centre
    ylarge[1:-1,1:-1]=y
    zlarge[1:-1,1:-1]=z
    # fill extra columns/rows
    xmin,xmax=-50,50
    ymin,ymax=-40,40
    xlarge[:,0]=xmin;xlarge[:,-1]=xmax # fill first/last column
    xlarge[0,:]=xlarge[1,:];xlarge[-1,:]=xlarge[-2,:] # copy first/last row
    ylarge[0,:]=ymin;ylarge[-1,:]=ymax
    ylarge[:,0]=ylarge[:,1];ylarge[:,-1]=ylarge[:,-2]
    # for speed gain: store factor of first/last column/row
    first_column_factor=(xlarge[:,0]-xlarge[:,1])/(xlarge[:,1]-xlarge[:,2]) 
    last_column_factor=(xlarge[:,-1]-xlarge[:,-2])/(xlarge[:,-2]-xlarge[:,-3])
    first_row_factor=(ylarge[0,:]-ylarge[1,:])/(ylarge[1,:]-ylarge[2,:])
    last_row_factor=(ylarge[-1,:]-ylarge[-2,:])/(ylarge[-2,:]-ylarge[-3,:])
    # extrapolate z; this operation only needs to be repeated when zlarge[1:-1,1:-1] is updated
    zlarge[:,0]=zlarge[:,1]+first_column_factor*(zlarge[:,1]-zlarge[:,2]) # extrapolate first column
    zlarge[:,-1]=zlarge[:,-2]+last_column_factor*(zlarge[:,-2]-zlarge[:,-3]) # extrapolate last column
    zlarge[0,:]=zlarge[1,:]+first_row_factor*(zlarge[1,:]-zlarge[2,:]) # extrapolate first row
    zlarge[-1,:]=zlarge[-2,:]+last_row_factor*(zlarge[-2,:]-zlarge[-3,:]) #extrapolate last row
    

    Then you can interpolate on (xlarge,ylarge,zlarge). Since all operations are numpy slices operations, I hope it will be fast enough for you. When z data are updated, copy them in zlarge[1:-1,1:-1] and re-execute the 4 last lines.