python dataset coordinates interpolation python-xarray

Python+xarray: Displaying datasets

I have a dataset, which I am to interpolate.

Original dataset: a field with a graticule (latitude: 17, longitude: 13, step: 0.25x0.25 degrees) and 221 values within this graticule.

ds= xr.open_dataset('gfs.t00z.pgrb2.0p25.f000', engine='cfgrib')
print(ds['t2m'])

'''
Output:
<xarray.DataArray 't2m' (latitude: 17, longitude: 13)>
[221 values with dtype=float32]
Coordinates:
    time               datetime64[ns] ...
    step               timedelta64[ns] ...
    heightAboveGround  float64 ...
  * latitude           (latitude) float64 47.0 47.25 47.5 ... 50.5 50.75 51.0
  * longitude          (longitude) float64 1.0 1.25 1.5 1.75 ... 3.5 3.75 4.0
'''

I have to transform the field into a field with graticule of a different latitude/longitude step (1.9047x1.875 degrees):

ds_i = ds.interp(latitude=[48.5705, 50.4752],
                 longitude=[1.875, 3.75],
                 method="linear")
print(ds_i['t2m'])

'''
Output:
<xarray.DataArray 't2m' (latitude: 2, longitude: 2)>
array([[281.84174231, 284.01994458],
       [281.00258201, 280.88313926]])
Coordinates:
    time               datetime64[ns] 2023-04-11
    step               timedelta64[ns] 00:00:00
    heightAboveGround  float64 2.0
    valid_time         datetime64[ns] 2023-04-11
  * latitude           (latitude) float64 48.57 50.48
  * longitude          (longitude) float64 1.875 3.75
'''

How do I display the original and interpolated datasets to compare them side by side and make sure I did everything right and achieved my goal?

Also, note that the interpolated coordinates are truncated (48.5705, 50.4752 compared to 48.57 50.48 in the output). Is there a way to keep the accuracy?

Update thanks to the answer-solution:

Solution

Here is a suggestion.

Create a dataset that looks like yours

import xarray as xr

ds_raw = xr.tutorial.load_dataset("air_temperature")
ds = ds_raw.air.isel(time=0)
print(ds)

<xarray.DataArray 'air' (lat: 25, lon: 53)>
array([[241.2    , 242.5    , 243.5    , ..., 232.79999, 235.5    ,
        238.59999],
       [243.79999, 244.5    , 244.7    , ..., 232.79999, 235.29999,
        239.29999],
       [250.     , 249.79999, 248.89   , ..., 233.2    , 236.39   ,
        241.7    ],
       ...,
       [296.6    , 296.19998, 296.4    , ..., 295.4    , 295.1    ,
        294.69998],
       [295.9    , 296.19998, 296.79   , ..., 295.9    , 295.9    ,
        295.19998],
       [296.29   , 296.79   , 297.1    , ..., 296.9    , 296.79   ,
        296.6    ]], dtype=float32)
Coordinates:
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
    time     datetime64[ns] 2013-01-01

Interpolate, just as you did

ds_interp = ds.interp(lat=[48.5705, 50.4752], lon=[301.875, 303.75], method="linear")

Plot

To compare the original data and the interpolated data, you could use pcolormesh to plot the original data and then scatter plot the new points that you interpolated to.

# plot
import matplotlib.pyplot as plt

vmin, vmax = ds.min(), ds.max()

# original grid
ds.plot.pcolormesh(vmin=vmin, vmax=vmax)

# stack to obtain 1D lon, 1D lat and 1D temperature
ds_interp_gridpoints = ds_interp.stack(gridpoint=["lon", "lat"])

# scatter plot inteprolated dataset
plt.scatter(
    ds_interp_gridpoints.lon,
    ds_interp_gridpoints.lat,
    c=ds_interp_gridpoints,
    vmin=vmin,
    vmax=vmax,
    s=100,
    ec="k",
)

plt.xlim(290, 320)
plt.ylim(45, 55)
plt.show()

Accuracy of coordinates

When printing a DataArray/ Dataset the coordinates are rounded automatically. The actual underlying data is not rounded, as can be seen by:

print(ds_interp.lat.values)

[48.5705 50.4752]