Search code examples
pythonmatplotlibpython-xarraynetcdf4

Plotting netCDF with xarray, data not showing but legend is


I am trying to use the simple .plot() function from xarray on a netCDF file from the atmospheric composition analysis group.

Say I want to plot the PM2.5 concentration from North America in 2000 available here.

When I try to plot the dataset I get an empty figure even though the data exists as shown by the legend bar.

import xarray as xr
import netCDF4 as nc
import matplotlib.pyplot as plt

path_to_nc="my/path/file.nc"

ds=xr.open_dataset(path_to_nc)
print(ds)
>>>

<xarray.Dataset>
Dimensions:  (LAT: 4550, LON: 9300)
Coordinates:
  * LON      (LON) float64 -138.0 -138.0 -138.0 -138.0 ... -45.03 -45.01 -45.01
  * LAT      (LAT) float64 68.0 67.99 67.97 67.96 ... 22.53 22.52 22.51 22.5
Data variables:
    PM25     (LAT, LON) float32 ...

The file does have values (not just nan).

# Range of values:
ds=ds['PM25']
print(ds)
>>>

<xarray.DataArray 'PM25' (LAT: 4550, LON: 9300)>
array([[1.6, 1.6, 1.6, ..., 1.2, 1.2, 1.2],
       [1.6, 1.6, 1.6, ..., 1.2, 1.2, 1.2],
       [1.6, 1.6, 1.6, ..., 1.2, 1.2, 1.2],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
  * LON      (LON) float64 -138.0 -138.0 -138.0 -138.0 ... -45.03 -45.01 -45.01
  * LAT      (LAT) float64 68.0 67.99 67.97 67.96 ... 22.53 22.52 22.51 22.5
Attributes:
    standard_name:  PM25
    units:          ug/m3

But if I try to plot the values then I get an empty ax.

ds.plot()

Output


Solution

  • The problem is that you are trying to plot too many data all together. If you just select a subset of them it works:

    #select data
    dssel=ds.where((-125 < ds.LON) & (ds.LON < -115)
             & (49 < ds.LAT) & (ds.LAT < 55), drop=True)
    #plot PM2.5
    plt.figure()
    dssel.PM25.plot()
    

    Here the result: enter image description here

    Interestingly if you plot the data using matplotlib directly it is much faster and I'm able to plot the entire dataset (on my 4 years old and not particularly fast laptop it takes about 20 seconds). I used netCDF4 library to load the PM2.5 dataset in this case.

    from netCDF4 import Dataset
    nc_fid = Dataset(fpath, 'r')
    
    lats = nc_fid.variables['LAT'][:]  # extract/copy the data
    lons = nc_fid.variables['LON'][:]
    PM25 = nc_fid.variables['PM25'][:]
    
    fig, axs = plt.subplots(figsize=(15, 10), nrows=2,ncols=1,gridspec_kw={'height_ratios': [20,1.5]},constrained_layout=True)
    pcm=axs[0].pcolormesh(lons,lats,PM25,cmap='viridis')
    cbar=fig.colorbar(pcm,cax=axs[1], extend='both', orientation='horizontal')
    cbar.set_label('PM 2.5 [$\mu$g m$^{-3}]$')
    

    enter image description here