I have a dataset that I need to interpolate.
Original dataset: a field on a graticule of 17 latitude x 13 longitude points (step 0.25x0.25 degrees), i.e. 221 values on this grid.
import xarray as xr

ds = xr.open_dataset('gfs.t00z.pgrb2.0p25.f000', engine='cfgrib')
print(ds['t2m'])
'''
Output:
<xarray.DataArray 't2m' (latitude: 17, longitude: 13)>
[221 values with dtype=float32]
Coordinates:
time datetime64[ns] ...
step timedelta64[ns] ...
heightAboveGround float64 ...
* latitude (latitude) float64 47.0 47.25 47.5 ... 50.5 50.75 51.0
* longitude (longitude) float64 1.0 1.25 1.5 1.75 ... 3.5 3.75 4.0
'''
I have to transform the field onto a graticule with a different latitude/longitude step (1.9047x1.875 degrees):
ds_i = ds.interp(latitude=[48.5705, 50.4752],
                 longitude=[1.875, 3.75],
                 method="linear")
print(ds_i['t2m'])
'''
Output:
<xarray.DataArray 't2m' (latitude: 2, longitude: 2)>
array([[281.84174231, 284.01994458],
[281.00258201, 280.88313926]])
Coordinates:
time datetime64[ns] 2023-04-11
step timedelta64[ns] 00:00:00
heightAboveGround float64 2.0
valid_time datetime64[ns] 2023-04-11
* latitude (latitude) float64 48.57 50.48
* longitude (longitude) float64 1.875 3.75
'''
How do I display the original and interpolated datasets side by side to compare them and make sure the interpolation did what I intended?
Also, note that the interpolated coordinates appear truncated (48.5705, 50.4752 compared to 48.57, 50.48 in the output). Is there a way to keep the full precision?
Here is a suggestion.
import xarray as xr
ds_raw = xr.tutorial.load_dataset("air_temperature")
ds = ds_raw.air.isel(time=0)
print(ds)
->
<xarray.DataArray 'air' (lat: 25, lon: 53)>
array([[241.2 , 242.5 , 243.5 , ..., 232.79999, 235.5 ,
238.59999],
[243.79999, 244.5 , 244.7 , ..., 232.79999, 235.29999,
239.29999],
[250. , 249.79999, 248.89 , ..., 233.2 , 236.39 ,
241.7 ],
...,
[296.6 , 296.19998, 296.4 , ..., 295.4 , 295.1 ,
294.69998],
[295.9 , 296.19998, 296.79 , ..., 295.9 , 295.9 ,
295.19998],
[296.29 , 296.79 , 297.1 , ..., 296.9 , 296.79 ,
296.6 ]], dtype=float32)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
time datetime64[ns] 2013-01-01
ds_interp = ds.interp(lat=[48.5705, 50.4752], lon=[301.875, 303.75], method="linear")
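Since your target graticule has a regular step (1.9047 deg in latitude, 1.875 deg in longitude), you may want the full coarse grid rather than just a couple of points; the target coordinates can then be built with np.arange and passed to interp. A minimal self-contained sketch, using synthetic data standing in for the GRIB field (the grid bounds match the question; the start values of the target grid are assumptions, substitute your own):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the GRIB field: same graticule as in the question
# (latitude 47..51, longitude 1..4, step 0.25 deg -> 17 x 13 points).
lat = np.arange(47.0, 51.0 + 0.25, 0.25)
lon = np.arange(1.0, 4.0 + 0.25, 0.25)
t2m = xr.DataArray(
    280.0 + np.random.rand(lat.size, lon.size),
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
    name="t2m",
)

# Target graticule with the coarser step (1.9047 x 1.875 deg).
# Start values are assumptions taken from the question's example points.
new_lat = np.arange(48.5705, 51.0, 1.9047)  # [48.5705, 50.4752]
new_lon = np.arange(1.875, 4.0, 1.875)      # [1.875, 3.75]
t2m_i = t2m.interp(latitude=new_lat, longitude=new_lon, method="linear")
print(t2m_i.shape)  # (2, 2)
```

Any target points falling outside the original grid would become NaN, since interp does not extrapolate by default.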
To compare the original data and the interpolated data, you could use pcolormesh to plot the original data and then overlay a scatter plot of the points you interpolated to.
# plot
import matplotlib.pyplot as plt

# shared color scale so both plots are directly comparable
vmin, vmax = float(ds.min()), float(ds.max())

# original grid
ds.plot.pcolormesh(vmin=vmin, vmax=vmax)

# stack to obtain 1D lon, 1D lat and 1D temperature
ds_interp_gridpoints = ds_interp.stack(gridpoint=["lon", "lat"])

# scatter plot of the interpolated dataset
plt.scatter(
    ds_interp_gridpoints.lon,
    ds_interp_gridpoints.lat,
    c=ds_interp_gridpoints,
    vmin=vmin,
    vmax=vmax,
    s=100,
    ec="k",
)
plt.xlim(290, 320)
plt.ylim(45, 55)
plt.show()
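If you literally want the two fields side by side, another option is two panels sharing one color scale. A self-contained sketch with synthetic data standing in for your t2m field (any 2-D DataArray works the same way):

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Synthetic field on the question's 0.25-deg graticule (stand-in for ds['t2m'])
lat = np.arange(47.0, 51.25, 0.25)
lon = np.arange(1.0, 4.25, 0.25)
da = xr.DataArray(
    280.0 + np.random.rand(lat.size, lon.size),
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
    name="t2m",
)
da_i = da.interp(latitude=[48.5705, 50.4752], longitude=[1.875, 3.75])

# Shared color scale so the two panels are directly comparable
vmin, vmax = float(da.min()), float(da.max())

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
da.plot.pcolormesh(ax=ax0, vmin=vmin, vmax=vmax)
da_i.plot.pcolormesh(ax=ax1, vmin=vmin, vmax=vmax)
ax0.set_title("original (0.25 deg)")
ax1.set_title("interpolated")
fig.tight_layout()
fig.savefig("compare.png")  # or plt.show()
```

The shared vmin/vmax is what makes the visual comparison meaningful; without it each panel would be normalized independently.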
When printing a DataArray/Dataset, the coordinates are rounded for display only. The actual underlying data is not rounded, as can be seen by:
print(ds_interp.lat.values)
[48.5705 50.4752]
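You can also verify this without any GRIB file: the rounding happens only in xarray's repr, and the stored float64 coordinates compare exactly equal to the values you passed in. A tiny self-contained check (coordinate names here are illustrative):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.zeros((2, 2)),
    coords={"lat": [48.5705, 50.4752], "lon": [1.875, 3.75]},
    dims=("lat", "lon"),
)
# The repr shows "48.57 50.48", but the stored coordinates are exact:
assert np.array_equal(da.lat.values, [48.5705, 50.4752])
print(da.lat.values)  # [48.5705 50.4752]
```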