Search code examples
pythongeospatialh5pycartopysatellite-image

Plotting tiles of nighttime lights from multiple h5 files into one big map


I am trying to create a map of nighttime lights for Europe. Data source is NASA and I am using their VNP46A1 data for 2022/03/27 (you can find the exact files in my Dropbox, too). For downloading the data, I select the whole EU region which results in about 14 separate files or "tiles". I managed to read in and plot the files individually, but I don't know how to put them together to get one big map of the whole European region. I'm not sure what keywords I should search for because I'm totally unfamiliar with this kind of file format and plotting something like this. Below is my working code for plotting a single file (adapted from this Medium article).

import numpy as np
import matplotlib.pyplot as plt
import h5py
import cartopy.crs as ccrs
import os
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
from mpl_toolkits.axes_grid1 import make_axes_locatable
import matplotlib.ticker as mticker
import glob

indir = 'D:/PYTHON/NASA/VNP46A1.A2023058.h17v03.001.2023060025350.h5'
data = h5py.File(indir)
list(data.keys())
dnb = data['HDFEOS/GRIDS/VNP_Grid_DNB/Data Fields/DNB_At_Sensor_Radiance_500m']
dnb = np.array(dnb)
dnb = np.where(dnb==65535, np.nan, dnb)
# np.nanmax(dnb)

lat_max = data['HDFEOS/GRIDS/VNP_Grid_DNB'].attrs['NorthBoundingCoord']
lat_min = data['HDFEOS/GRIDS/VNP_Grid_DNB'].attrs['SouthBoundingCoord']
lon_min = data['HDFEOS/GRIDS/VNP_Grid_DNB'].attrs['WestBoundingCoord']
lon_max = data['HDFEOS/GRIDS/VNP_Grid_DNB'].attrs['EastBoundingCoord']

lats = np.linspace(lat_max, lat_min, dnb.shape[1])
lons = np.linspace(lon_min, lon_max, dnb.shape[0])
x, y = np.meshgrid(lons, lats)

cmap = plt.get_cmap('cividis')
crs = ccrs.PlateCarree()

fig =  plt.figure(figsize=(6,6))
ax = plt.axes(projection=crs)
ax.coastlines(resolution='10m', color='white',alpha=0.5)
ax.pcolormesh(x, y, dnb, cmap = cmap, vmin = 0, vmax = 100)

plt.xlim()

gl = ax.gridlines(crs=crs, draw_labels=True, alpha=0.5)
gl.xlabels_top = None
gl.ylabels_left = None
xgrid = np.arange(lon_min-2, lon_max+2, 2.)
ygrid = np.arange(lat_min-2, lat_max+2, 2.)
gl.xlocator = mticker.FixedLocator(xgrid.tolist())
gl.ylocator = mticker.FixedLocator(ygrid.tolist())
gl.xformatter = LONGITUDE_FORMATTER
gl.yformatter = LATITUDE_FORMATTER
gl.xlabel_style = {'size': 10, 'color': 'black'}
gl.ylabel_style = {'size': 10, 'color': 'black'}

plt.xlim((lon_min,lon_max))
plt.ylim((lat_min,lat_max))

plt.show()

This is the output I get after running the above code:

enter image description here

My desired end result would be something like this:

enter image description here

source

I thought about plotting in a loop but that took very long and yielded incorrect results in the end.


Solution

  • Ok, this was more work than I expected. I tried to write a generic procedure (not dependent on latitude/longitude range. Mapping HDF5 DNB data to the NumPy dnb array got too complicated. (And, there were a few matplotlib and cartopy surprises along the way.) Code and image are at the end.

    First, a clarification about the data. There are 20 files (It's a grid of 5x4 datasets/images.). Each covers a 10 deg x 10 deg grid. Overall lat/lon range of the files is +30 to +70 latitude and -10 to +30 longitude. To plot as one image, the x, y, and dnb arrays have to be sized to hold all 20 datasets, and the HDF5 DNB datasets have to be mapped to the dnb array.

    Here is the procedure:

    1. Loop over the files to get the max/min for latitude and longitude. (Since you know the range, you could skip this step and hard code the values.) I also get the DNB dataset type and shape, and use to size the x, y, and dnb arrays.
    2. Create the x and y arrays using the min/max longitude and latitude from values above.
    3. Loop over over the files (again) to read the DNB data and map to the dnb array using slice notation. I hard coded the slicing indices. (It could be done more elegantly.)
    4. Once the data is loaded, the matplotlib / cartopy procedure is almost the same -- only a few lines have to be changed to account for the map extents for all 20 files.

    Code for each step below:

    Step 1 - Get the max/min latitude/longitude values:

    folder = 'D:/PYTHON/NASA'
    h5files = glob.glob(folder+'/*.h5')
    n_files = len(h5files)
    n_files_lon = 5
    n_files_lat = 4
    lat_lon_arr = np.empty(shape=(n_files,4)) 
    for i, h5file in enumerate(h5files):
        with h5py.File(h5file) as h5f:
            dnb_attr_ds = h5f['HDFEOS/GRIDS/VNP_Grid_DNB']
            lat_min = dnb_attr_ds.attrs['SouthBoundingCoord'][0]
            lat_max = dnb_attr_ds.attrs['NorthBoundingCoord'][0]
            lon_min = dnb_attr_ds.attrs['WestBoundingCoord'][0]
            lon_max = dnb_attr_ds.attrs['EastBoundingCoord'][0]
            lat_lon_arr[i] = [round(lat_min), round(lat_max), round(lon_min), round(lon_max)]
            dnb_ds = h5f['HDFEOS/GRIDS/VNP_Grid_DNB/Data Fields/DNB_At_Sensor_Radiance_500m']
            dnb_ds_shape = dnb_ds.shape
    
    map_lat_min = np.amin(lat_lon_arr[:,0])
    map_lat_max = np.amax(lat_lon_arr[:,1])
    map_lon_min = np.amin(lat_lon_arr[:,2])
    map_lon_max = np.amax(lat_lon_arr[:,3])
    print(f'map_lat_min= {map_lat_min:.1f}, map_lat_max={map_lat_max:.1f}', 
          f'map_lon_min={map_lon_min:.1f}, map_lon_max= {map_lon_max:.1f}\n')
    

    Step 2 - Create the x and y arrays from min/max longitude and latitude values above:

    lats = np.linspace(map_lat_max, map_lat_min, n_files_lat*dnb_ds_shape[1])#.astype(np.float32)
    lons = np.linspace(map_lon_min, map_lon_max, n_files_lon*dnb_ds_shape[0])#.astype(np.float32)
    x, y = np.meshgrid(lons, lats)
    

    Step 3 - Create the dnb array and load H5 DNB data into it:

    dnb = np.empty(shape=x.shape,dtype=np.uint16)
    
    cnt_lat, cnt_lon = 0, 0
    for h5file in h5files:
        with h5py.File(h5file) as h5f:
            dnb_ds = h5f['HDFEOS/GRIDS/VNP_Grid_DNB/Data Fields/DNB_At_Sensor_Radiance_500m']
            dnb[cnt_lat*dnb_ds_shape[0]:(cnt_lat+1)*dnb_ds_shape[0], 
                cnt_lon*dnb_ds_shape[1]:(cnt_lon+1)*dnb_ds_shape[1]] = dnb_ds
            cnt_lat += 1
            if cnt_lat % n_files_lat == 0:
               cnt_lat = 0
               cnt_lon += 1
    

    Step 4 - matplotlib/cartopy changes to plot 5x4 grid:

    fig = plt.figure(figsize=(n_files_lat*6,n_files_lon*6))
    ...
    xgrid = np.arange(map_lon_min-2, map_lon_max+2, 2.)
    ygrid = np.arange(map_lat_min-2, map_lat_max+2, 2.)
    ...
    plt.xlim((map_lon_min,map_lon_max))
    plt.ylim((map_lat_min,map_lat_max))
    

    Here is the resulting image: enter image description here