I am using xarray/rasterio to do some operations on a set of GeoTiff files. The data that I am dealing with contain missing values (-9999) at some grid points. Is there any efficient pythonic way to replace these values with interpolated data using xarray?
For example I have something like:
>>> da = xr.DataArray([[1.0,2.0,3.0],[4.0,-999.0,6.0],[7.0,-999.0,9.0]],[('x',[1,2,3]),('y',[1,2,3])])
>>> da
<xarray.DataArray (x: 3, y: 3)>
array([[ 1., 2., 3.],
[ 4., -9999., 6.],
[ 7., -9999., 9.]])
Coordinates:
* x (x) int64 1 2 3
* y (y) int64 1 2 3
and I want to replace -9999 values with the interpolated values based on adjacent grid points. Any hint would be appreciated!
You can use da.interpolate_na()
to do one dimensional interpolation of NA values - I am unsure if there is a two dimensional method.
You can convert your -999 values to NaN's and then interpolate.
da = xr.DataArray([[1.0,2.0,3.0],[4.0,-999.0,6.0],[7.0,-999.0,9.0]],[('x',[1,2,3]),('y',[1,2,3])])
da
<xarray.DataArray (x: 3, y: 3)>
array([[ 1., 2., 3.],
[ 4., -999., 6.],
[ 7., -999., 9.]])
Coordinates:
* x (x) int64 1 2 3
* y (y) int64 1 2 3
new_da = da.where(da != -999.0, np.nan)
new_da
<xarray.DataArray (x: 3, y: 3)>
array([[ 1., 2., 3.],
[ 4., nan, 6.],
[ 7., nan, 9.]])
Coordinates:
* x (x) int64 1 2 3
* y (y) int64 1 2 3
new_da.interpolate_na(dim=('y'), method='linear')
<xarray.DataArray (x: 3, y: 3)>
array([[1., 2., 3.],
[4., 5., 6.],
[7., 8., 9.]])
Coordinates:
* x (x) int64 1 2 3
* y (y) int64 1 2 3
Since it's a 1-D interpolation, you can just interpolate for each dimension separately.