Search code examples
pythondatasetcoordinatesinterpolationpython-xarray

Python+xarray: Displaying datasets


I have a dataset, which I am to interpolate.

Original dataset: a field with a graticule (latitude: 17, longitude: 13, step: 0.25x0.25 degrees) and 221 values within this graticule.

ds= xr.open_dataset('gfs.t00z.pgrb2.0p25.f000', engine='cfgrib')
print(ds['t2m'])

'''
Output:
<xarray.DataArray 't2m' (latitude: 17, longitude: 13)>
[221 values with dtype=float32]
Coordinates:
    time               datetime64[ns] ...
    step               timedelta64[ns] ...
    heightAboveGround  float64 ...
  * latitude           (latitude) float64 47.0 47.25 47.5 ... 50.5 50.75 51.0
  * longitude          (longitude) float64 1.0 1.25 1.5 1.75 ... 3.5 3.75 4.0
'''

I have to transform the field into a field with graticule of a different latitude/longitude step (1.9047x1.875 degrees):

ds_i = ds.interp(latitude=[48.5705, 50.4752],
                 longitude=[1.875, 3.75],
                 method="linear")
print(ds_i['t2m'])

'''
Output:
<xarray.DataArray 't2m' (latitude: 2, longitude: 2)>
array([[281.84174231, 284.01994458],
       [281.00258201, 280.88313926]])
Coordinates:
    time               datetime64[ns] 2023-04-11
    step               timedelta64[ns] 00:00:00
    heightAboveGround  float64 2.0
    valid_time         datetime64[ns] 2023-04-11
  * latitude           (latitude) float64 48.57 50.48
  * longitude          (longitude) float64 1.875 3.75
'''

How do I display the original and interpolated datasets to compare them side by side and make sure I did everything right and achieved my goal?

Also, note that the interpolated coordinates are truncated (48.5705, 50.4752 compared to 48.57 50.48 in the output). Is there a way to keep the accuracy?

Update thanks to the answer-solution: enter image description here


Solution

  • Here is a suggestion.

    Create a dataset that looks like yours

    import xarray as xr
    
    ds_raw = xr.tutorial.load_dataset("air_temperature")
    ds = ds_raw.air.isel(time=0)
    print(ds)
    

    ->

    <xarray.DataArray 'air' (lat: 25, lon: 53)>
    array([[241.2    , 242.5    , 243.5    , ..., 232.79999, 235.5    ,
            238.59999],
           [243.79999, 244.5    , 244.7    , ..., 232.79999, 235.29999,
            239.29999],
           [250.     , 249.79999, 248.89   , ..., 233.2    , 236.39   ,
            241.7    ],
           ...,
           [296.6    , 296.19998, 296.4    , ..., 295.4    , 295.1    ,
            294.69998],
           [295.9    , 296.19998, 296.79   , ..., 295.9    , 295.9    ,
            295.19998],
           [296.29   , 296.79   , 297.1    , ..., 296.9    , 296.79   ,
            296.6    ]], dtype=float32)
    Coordinates:
      * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
      * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
        time     datetime64[ns] 2013-01-01
    

    Interpolate, just as you did

    ds_interp = ds.interp(lat=[48.5705, 50.4752], lon=[301.875, 303.75], method="linear")
    

    Plot

    To compare the original data and the interpolated data, you could use pcolormesh to plot the original data and then scatter plot the new points that you interpolated to.

    # plot
    import matplotlib.pyplot as plt
    
    vmin, vmax = ds.min(), ds.max()
    
    # original grid
    ds.plot.pcolormesh(vmin=vmin, vmax=vmax)
    
    # stack to obtain 1D lon, 1D lat and 1D temperature
    ds_interp_gridpoints = ds_interp.stack(gridpoint=["lon", "lat"])
    
    # scatter plot inteprolated dataset
    plt.scatter(
        ds_interp_gridpoints.lon,
        ds_interp_gridpoints.lat,
        c=ds_interp_gridpoints,
        vmin=vmin,
        vmax=vmax,
        s=100,
        ec="k",
    )
    
    plt.xlim(290, 320)
    plt.ylim(45, 55)
    plt.show()
    

    original data plus interpolated data

    Accuracy of coordinates

    When printing a DataArray/ Dataset the coordinates are rounded automatically. The actual underlying data is not rounded, as can be seen by:

    print(ds_interp.lat.values)
    
    [48.5705 50.4752]