Search code examples
pythonpolygonnetcdfshapefilemasking

Python: Masking ERA5 data (NetCDF) from shapefile (polygon/multipolygon)


I want to select grid cells from ERA5 gridded data (surface level only) that are inside geographical masks for North- and South-Switzerland (plus the radar buffer), to calculate regional means. The 4 masks (masks) are given as polygons/multipolygons (polygons) in a shapefile and so far for 2 of the masks I was able to use salem roi to get what I want:

radar_north = salem.read_shapefile('radar_north140.shp')
file_radar_north = file.salem.roi(shape=radar_north)
file_radar_north.cape.mean(dim='time').salem.quick_map() 

However, for the radar_south and alpensuedseite shapefiles the code didn´t work at the beginning (wrong selection or shows no data), and now the nothing works anymore (?). I don´t know why, as I have not changed anything from the first time to the second.

If someone sees the issue or knows a different way to mask the ERA data (which is maybe quicker) I would be grateful! (I was unsuccessfull with the answers from similar questions here).

Best Lena


Solution

  • This could work if you are working on netcdf files

    import geopandas as gpd
    import xarray as xr
    import rioxarray
    from shapely.geometry import mapping
    
    # load shapefile with geopandas
    radar_north = gpd.read_file('radar_north140.shp')
    
    # load ERA5 netcdf with xarray
    era = xr.open_dataset('ERA5.nc')
    
    # add projection system to nc
    era = era.rio.write_crs("EPSG:4326", inplace=True)
    
    # mask ERA5 data with shapefile
    era_radar_north = era.rio.clip(radar_north.geometry.apply(mapping), radar_north.crs)