Search code examples
pythonnetcdfgdalgeotiff

Python: how to merge two different netCdf filles with different spatial resolution?


Is it possible to merge two netCDF files with different spatial resolution?

I have two datasets.

The first one is the ESA Land Cover dataset with a spatial resoltion of 300m as netCDF.

The first one is the population living in Italy with a spatial resolution of 100m from WorldPop as a geoTIFF that I convert as netCDF.

This is what I am doing

## convert GeoTiff to netCDF
from osgeo import gdal
fin = ita_ppp_2000.tif'
fout = 'ita_ppp_2000.nc'
ds = gdal.Translate(fout, fin, format='NetCDF')

I download the ESA CCI data

year = 2000
import cdsapi
c.retrieve(
    'satellite-land-cover',
    {
        'variable': 'all',
        'format': 'zip',
        'version': 'v2.1.1',
        'year': str(y),
    },
    'download_%d.zip'%y) ## we need to unzip it

fn = 'ESACCI-LC-L4-LCCS-Map-300m-P1Y-%d-v2.0.7cds.nc'%year## Global 300m resolution

I get the data for Italy

def clipEsa(fn,x0,x1,y0,y1):
    dnc = xr.open_dataset(fn)
    lat = dnc.variables['lat'][:]
    lon = dnc.variables['lon'][:]

    # All indices in bounding box:
    where_j = np.where((lon >= x0) & (lon <= x1))[0]
    where_i = np.where((lat >= y0) & (lat <= y1))[0]

    # Start and end+1 indices in each dimension:
    i0 = where_i[0]
    i1 = where_i[-1]+1

    j0 = where_j[0]
    j1 = where_j[-1]+1

    longitude = dnc["lccs_class"]["lon"].values[j0:j1]
    latitude = dnc["lccs_class"]["lat"].values[i0:i1]
    
    time = dnc['lccs_class']['time'][0]
    return dnc.sel(time=time, lon=longitude, lat=latitude) 

  wp = xr.open_dataset(fout)  ## Italian population with 100m resolution
  bb = [wp.lon[0].values, wp.lon[-1].values, wp.lat[0].values, wp.lat[-1].values] ## bounding box
  esaItaly = clipEsa(fn,bb[0],bb[1],bb[2],bb[3])  ## ESA CCI clipped for Italy

New I would like to have both the datasets at the spatial resolution of 300m. In particular I would like to resample with a sum the wp dataset from 100m to 300m in the same pixels of esaItaly

This is what I tried

wp_inter = wp.interp(lat=esaItaly["lat"], lon=esaItaly["lon"])

but the total amount of population is much lower.

sum(wp_inter['Band1'].values[wp_inter['Band1'].values>0])
5038174.5   ## population interpolated

sum(wp.Band1.values[wp.Band1.values>0])
56780870.0   ## original population

Solution

  • There are many ways to do it, but probably the easiest one is by gdalbuildvrt.

    Use gdalbuildvrt - either from the command-line either from the Python library - and build a VRT dataset. Make sure the highest resolution files are listed towards the end - if there is overlapping the final dataset wins.

    Remember to use [-resolution {highest|lowest|average|user}] option.

    Once you have a composite Dataset, use gdal_translate - CLI or Python - to consolidate it to a single monolithic Dataset in your preferred format.

    Don't try to implement this yourself - it is more complicated than it might seem.