dask, netcdf, python-xarray, noaa

Trying to open a series of netCDF files using OPeNDAP


I want to open all the data from 1950 to 2005 using xarray's open_mfdataset. The files are listed here: https://www.esrl.noaa.gov/psd/thredds/catalog/Datasets/ncep.reanalysis/surface/catalog.html

This is what I have done so far:

    import xarray as xr

    source = 'https://www.esrl.noaa.gov/psd/thredds/catalog/Datasets/ncep.reanalysis/surface/air.sig995.years.nc'
    files = [source for years in range(1950,2005,1)]
    ds = xr.open_mfdataset(files)
    print(ds)

However, I cannot get the values from my loop substituted for the "years" placeholder inside source.

Any ideas?

Thank you in advance.

EDIT:

    path = 'https://www.esrl.noaa.gov/psd/thredds/catalog/Datasets/ncep.reanalysis/surface'
    files = ['{0}/air.sig995.{1:04d}.nc'.format(path, years) for years in range(1950,2005,1)]
    print(files)
    nc = netCDF4.MFDataset(files)

This is the code I am using. When I try to open up these files I get an error:

OSError: [Errno -90] NetCDF: file not found: b'https://www.esrl.noaa.gov/psd/thredds/catalog/Datasets/ncep.reanalysis/surface/air.sig995.1948.nc'

Did I not enter the path correctly?


Solution

  • All files are named air.sig995.YYYY.nc, so you need something like:

    files = ['air.sig995.{0:04d}.nc'.format(years) for years in range(1950,2005,1)]
    

    This produces:

    In [2]: files
    Out[2]: 
    ['air.sig995.1950.nc',
     'air.sig995.1951.nc',
     'air.sig995.1952.nc',
     'air.sig995.1953.nc',
     .....
    

    You can also easily include a (remote) path here (if required):

    path = '/some/file/path'
    files = ['{0}/air.sig995.{1:04d}.nc'.format(path, years) for years in range(1950,2005,1)]
    

    See https://pyformat.info/ for more information on string formatting in Python.
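Putting the pieces together, here is a minimal sketch of building the full URL list and handing it to xarray. Two things in it are assumptions rather than something confirmed in this thread. First, the base URL swaps `/thredds/catalog/` for `/thredds/dodsC/`: on THREDDS servers the catalog path normally serves the HTML listing, while OPeNDAP access usually lives under `dodsC`, which may explain the "file not found" error in the question. Second, `range()` excludes its stop value, so the sketch adds 1 to the last year on the assumption that 2005 should be included. The helper names `build_urls` and `open_years` are made up for illustration.

```python
def build_urls(base, first_year, last_year):
    """Return one file URL per year, from first_year through last_year inclusive."""
    # range() excludes its stop value, hence last_year + 1.
    return [
        "{0}/air.sig995.{1:04d}.nc".format(base, year)
        for year in range(first_year, last_year + 1)
    ]


def open_years(urls):
    """Open all URLs as a single dataset (needs xarray, dask and network access)."""
    import xarray as xr  # imported here so build_urls works without xarray installed
    return xr.open_mfdataset(urls)


# ASSUMPTION: the OPeNDAP endpoint is under /thredds/dodsC/, not /thredds/catalog/.
base = "https://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface"
urls = build_urls(base, 1950, 2005)

print(urls[0])
print(urls[-1])
print(len(urls))
```

If the printed URLs look right, `ds = open_years(urls)` would attempt the actual remote read. Checking one URL with `xr.open_dataset(urls[0])` first is a cheap way to confirm the endpoint before opening all of them.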