Search code examples
python-3.xinterpolationpython-xarrayrasterio

Interpolate dataArray missing data using xarray


I am using xarray/rasterio to do some operations on a set of GeoTiff files. The data that I am dealing with contain missing values (-9999) at some grid points. Is there any efficient pythonic way to replace these values with interpolated data using xarray?

For example I have something like:

>>> da = xr.DataArray([[1.0,2.0,3.0],[4.0,-999.0,6.0],[7.0,-999.0,9.0]],[('x',[1,2,3]),('y',[1,2,3])])
>>> da
<xarray.DataArray (x: 3, y: 3)>
array([[   1.,    2.,    3.],
       [   4., -9999.,    6.],
       [   7., -9999.,    9.]])
Coordinates:
  * x        (x) int64 1 2 3
  * y        (y) int64 1 2 3

and I want to replace -9999 values with the interpolated values based on adjacent grid points. Any hint would be appreciated!


Solution

  • You can use da.interpolate_na() to do one dimensional interpolation of NA values - I am unsure if there is a two dimensional method.

    You can convert your -999 values to NaN's and then interpolate.

    da = xr.DataArray([[1.0,2.0,3.0],[4.0,-999.0,6.0],[7.0,-999.0,9.0]],[('x',[1,2,3]),('y',[1,2,3])])
    da
    
    <xarray.DataArray (x: 3, y: 3)>
    array([[   1.,    2.,    3.],
           [   4., -999.,    6.],
           [   7., -999.,    9.]])
    Coordinates:
      * x        (x) int64 1 2 3
      * y        (y) int64 1 2 3
    
    new_da = da.where(da != -999.0, np.nan)
    new_da
    
    <xarray.DataArray (x: 3, y: 3)>
    array([[ 1.,  2.,  3.],
           [ 4., nan,  6.],
           [ 7., nan,  9.]])
    Coordinates:
      * x        (x) int64 1 2 3
      * y        (y) int64 1 2 3
    
    new_da.interpolate_na(dim=('y'), method='linear')
    
    
    <xarray.DataArray (x: 3, y: 3)>
    array([[1., 2., 3.],
           [4., 5., 6.],
           [7., 8., 9.]])
    Coordinates:
      * x        (x) int64 1 2 3
      * y        (y) int64 1 2 3
    

    Since it's a 1-D interpolation, you can just interpolate for each dimension separately.