Search code examples
python-2.7numpyscipynetcdf

high resolution Reanalysis data


When I extract data from a netCDF file Reanalysis (variable pressure (SLP), 01/01/2014) the data is very high resolution (9km grid) which makes the resulting image quite noisy. I would like to put the data into a lower resolution grid (e.g. 1 degree). I'm trying to use the functions meshgrid and gridata, but inexperience am unable to make it work. Does anyone know how to solve? Thank you.

    from netCDF4 import Dataset
    import numpy as np
    from scipy.interpolate import griddata
    file = Dataset('slp_2014_01_01.nc', 'r')
    # Printing variables
    print ' '
    print ' '
    print '----------------------------------------------------------'
    for i,variable in enumerate(file.variables):
        print '   '+str(i),variable
        if i == 2:
            current_variable = variable
    print ' '
    print 'Variable: ', current_variable.upper()
    print 'File name:  ', file_name
    lat  = file.variables['lat'][:]
    lon  = file.variables['lon'][:]
    slp = file.variables['slp'][:]

    lon_i = np.linspace(lon[0], lon[len(REANALYSIS_lon)-1], num=len(lon)*2, endpoint=True, retstep=False)    
    lat_i = np.linspace(lat[0], lat[len(lat)-1], num=len(lat)*2, endpoint=True, retstep=False)

    lon_grid, lat_grid = np.meshgrid(lon_i,lat_i)

    temp_slp = np.asarray(slp).squeeze()
    new_slp = temp_slp.reshape(temp_slp.size)

    slp_grid = griddata((lon, lat), new_slp, (lon_grid, lat_grid),method='cubic')

As I mentioned, I tried to use the meshgrid and datagrid functions, but produced the following error:

Traceback (most recent call last): File "REANALYSIS_LOCAL.py", line 346, in
lon,lat,time,var,variavel_atual=netCDF_builder_local(caminho_netcdf_local,nome_arquivo,dt) File "REANALYSIS_LOCAL.py", line 143, in netCDF_builder_local
slp_grid = griddata((lon, lat), new_slp, (lon_grid, lat_grid),method='cubic')
File "/home/carlos/anaconda/lib/python2.7/site-packages/scipy/interpolate/ndgriddata.py", line 182, in griddata points = _ndim_coords_from_arrays(points)
File "interpnd.pyx", line 176, in scipy.interpolate.interpnd._ndim_coords_from_arrays (scipy/interpolate/interpnd.c:4064)
File "/home/carlos/anaconda/lib/python2.7/site-packages/numpy/lib/stride_tricks.py", line 101, in broadcast_arrays "incompatible dimensions on axis %r." % (axis,))
ValueError: shape mismatch: two or more arrays have incompatible dimensions on axis 0.

The dimensions of variables are:
lon: (144,)
lat: (73,)
lon_i: (288,)
lat_i: (146,)
lon_grid: (146, 288)
lat_grid: (146, 288)
new_slp: (10512,)

The values in new_slp are: new_slp: [ 102485. 102485. 102485. ..., 100710. 100710. 100710.]

The purpose is increase the values in the variables (lon, lat and slp), because the Reanalysis resolution is highter. Then, the resolution could be most detailed (more points).

For example: the variable lat have the points:

Original dimension variable lat: (73,)
lat: [ 90. 87.5 85. 82.5 80. 77.5 75. 72.5 70. 67.5 65. 62.5 60. 57.5 55. 52.5 50. 47.5 45. 42.5 40. 37.5 35. 32.5 30. 27.5 25. 22.5 20. 17.5 15. 12.5 10. 7.5 5. 2.5 0. -2.5 -5. -7.5 -10. -12.5 -15. -17.5 -20. -22.5 -25. -27.5 -30. -32.5 -35. -37.5 -40. -42.5 -45. -47.5 -50. -52.5 -55. -57.5 -60. -62.5 -65. -67.5 -70. -72.5 -75. -77.5 -80. -82.5 -85. -87.5 -90. ]

When I define the code line: lat_i = np.linspace(lat[0], lat[len(lat)-1], num=len(lat)*2, endpoint=True, retstep=False) I doubled the values of the lat variable la_i(146,)

lat _i: [ 90. 88.75862069 87.51724138 86.27586207 85.03448276 83.79310345 82.55172414 81.31034483 80.06896552 78.82758621 77.5862069
...
-78.82758621 -80.06896552 -81.31034483 -82.55172414 -83.79310345 -85.03448276 -86.27586207 -87.51724138 -88.75862069 -90. ]

The idea that I need is the same is available in this code, where x is lon, y is lat and slp is z.
from scipy.interpolate import griddata import numpy as np import matplotlib.pyplot as plt

x=np.linspace(1.,10.,20)
y=np.linspace(1.,10.,20)
z=z = np.random.random(20)
xi=np.linspace(1.,10.,40)
yi=np.linspace(1.,10.,40)

X,Y= np.meshgrid(xi,yi)

Z = griddata((x, y), z, (X, Y),method='nearest')

plt.contourf(X,Y,Z)

Solution

  • Depending on your final purpose, You may use cdo to regrid the whole file

    cdo remapbil,r360x180 infile outfile
    

    or just plot every second or third value from original file like this:

    plt.pcolormesh(lon[::2,::2],lat[::2,::2],var1[::2,::2])
    

    The error message You show just says that dimensions do not much, just print the shape of your variables before the error appears and try to get it working.

    Why Your code does not work? Your chosen method requires input coordinates as lon,lat pairs for data points, not mesh coordinates. If You have data points with shape 10000, your coordinates must be with the shape (10000,2), not (100,100). But as griddata is meant for unstructured data, it will not be efficient for Your purpose, I suggest using something like scipy.interpolate.RegularGridInterpolator

    But anyway, if You need to use the interpolated data more than once, I suggest creating new netCDF files with cdo and process them, instead of interpolating data each time You run Your script.