Search code examples
pythoncolorsgeopandasshapefilecartopy

Mask area outside of a shape file with Cartopy and Geopandas


I have a set of data with (lon, lat, temperature) that I have plotted with Cartopy. The minimum example that I can give is the code below (with only 30 data points)

import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
import matplotlib.colors as clr
import pandas as pd
import numpy as np

from metpy.interpolate import interpolate_to_grid, remove_nan_observations
from cartopy.io.shapereader import Reader
from cartopy.feature import ShapelyFeature

canada_east = -95
canada_west = -101.8
canada_north = 52.8
canada_south = 48.85

central_lon = (canada_east + canada_west)/2
central_lat = (canada_north + canada_south)/2

crs = ccrs.LambertConformal(central_longitude = central_lon, central_latitude = central_lat)
lat = np.array([49.8134 50.904  50.698  49.095  49.436  49.9607 49.9601 49.356  50.116
49.402  52.3472 50.411  49.24   49.876  49.591  49.905  49.498  49.088
49.118  50.5947 49.3776 49.148  49.1631 51.358  49.826  50.4324 49.96
49.68   49.875  50.829  51.572])
lon = np.array([-100.3721  -97.273   -99.068   -97.528  -100.308   -98.9054  -98.6367
-99.248   -96.434  -100.93   -101.1099 -100.893  -100.055   -99.909
-97.518   -99.354   -98.03    -99.325   -99.054   -98.0035 -100.5387
-100.491   -97.1454 -100.361   -96.776   -99.4392  -97.7463  -97.984
-95.92    -98.111  -100.488])
tem = np.array([-8.45   -4.026  -5.993  -3.68   -7.35   -7.421  -6.477  -8.03   -3.834
-13.04   -4.057  -8.79   -6.619 -10.89   -4.465  -8.41   -4.861  -9.93
-7.125  -4.424 -11.95   -9.56   -3.86   -7.17   -4.193  -7.653  -4.883
-5.631  -3.004  -4.738  -8.81])

xp, yp, _ = crs.transform_points(ccrs.PlateCarree(), lon, lat ).T
xp, yp, tem = remove_nan_observations(xp, yp, tem)

alt_x, alt_y, data = interpolate_to_grid( xp, yp, tem, minimum_neighbors=2, search_radius=240000, interp_type = 'barnes', hres = 1000)

# Create the figure and grid for subplots
fig = plt.figure(figsize=(17, 12))

# Main ax
ax = plt.subplot(111, projection=crs)
ax.set_extent([canada_west, canada_east, canada_south, canada_north], ccrs.PlateCarree())

# Ading province borders and country borders
provinces_bdr = cfeature.NaturalEarthFeature(category = 'cultural',
                                                name = 'admin_1_states_provinces_lines',
                                                scale = '50m',
                                                linewidth = 0.6,
                                                facecolor='none',
                                                )  # variable to add provinces border

country_bdr = cfeature.NaturalEarthFeature(category= 'cultural',
                                            name = 'admin_0_boundary_lines_land', 
                                            scale = '50m', 
                                            linewidth = 1.5,
                                            facecolor = 'none',
                                            edgecolor = 'k')
                                                
ax.add_feature(provinces_bdr, linestyle='--')
ax.add_feature(country_bdr, linestyle='--')

ax.add_feature(cfeature.LAND)
ax.add_feature(cfeature.OCEAN)
ax.add_feature(cfeature.BORDERS)

cf = ax.pcolormesh(alt_x, alt_y, data, cmap=plt.cm.rainbow)

# Read the shape file and add it

shape_feature = ShapelyFeature(Reader('MB_AGregion_Perim_South.shp').geometries(), ccrs.epsg(26914), linewidth = 1, facecolor = (1, 1, 1, 0), edgecolor = (0.5, 0.5, 0.5, 1))

ax.add_feature(shape_feature)
plt.show()

which gives this result:

this result

where the gray line inside is produced by the shape file. Now I want to limit the coloring to be only inside the shape file (so area that's outside of the gray line should not be colored by pcolormesh) but I can not find a way that work. I have read this example and this example but I cannot understand both of them. Is there a simple way to do this using geopandas and/or cartopy alone?

Sorry I cannot upload the shape file here, this is the best minimal example I could have done. If there are any improvements I should have done please tell me. I'm new to stack overflow and I'm open to critiques.

Edit1: To clarify, the shape file I want the color to be limited to is the 'MB_AGregion_Perim_South.shp' that I read with ShapelyFeature (the last 4 lines of my code), and it draw the grey line that bounds most part of my coloring.

Edit 2: As @Michael Delgado suggested, I have added this lines of code:

cat_gdf = geopandas.read_file('MB_AGregion_Perim_South.shp')
cat_gdf = cat_gdf.to_crs(epsg = 4326)
mask = shapely.vectorized.contains(cat_gdf.dissolve().geometry.item(), alt_x, alt_y)

where alt_x and alt_y is the interpolated result (please look at my example above). The shape file has epsg = 26914 originally, so I transform it into 4326.

The problem is that the mask contains all false values (which means it mask everything). I doubted that it's because alt_x and alt_y are coordinates that has been transformed with crs.transform_points(ccrs.PlateCarree(), lon, lat ).T (as my code showed above). I have search around and try to get the shape file into different epsg values but it doesn't work. Also, cat_gdf.geometry is a multi polygons. Could it be the cause here?

For anyone who's struggling with this in the future, here is a detailed explanation of the solution


Solution

  • Quick MRE:

    import numpy as np, pandas as pd, geopandas as gpd
    import matplotlib.pyplot as plt
    
    x = np.arange(-126, -105, 0.1)
    y = np.arange(25, 46, 0.1)
    xx, yy = np.meshgrid(x, y)
    xnorm = (xx - xx.min()) / (xx.max() - xx.min())
    ynorm = (yy - yy.min()) / (yy.max() - yy.min())
    v = np.cos((xnorm * 2 - 1) * np.pi) + np.sin((ynorm * 2 - 1) * np.pi)
    
    gdf = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
    
    fig, ax = plt.subplots()
    ax.pcolormesh(xx, yy, v)
    xlim, ylim = ax.get_xlim(), ax.get_ylim()
    gdf.plot(ax=ax, color='none', edgecolor='k')
    ax.set_xlim(*xlim)
    ax.set_ylim(*ylim)
    

    pcolormesh of gridded data with shapefile overlay

    You can use shapely.vectorized to mask a set of x, y points using a shapely.geometry object:

    import shapely.vectorized
    mask = shapely.vectorized.contains(gdf.dissolve().geometry.item(), xx, yy)
    
    fig, ax = plt.subplots()
    ax.pcolormesh(xx, yy, np.where(mask, v, np.nan))
    xlim, ylim = ax.get_xlim(), ax.get_ylim()
    gdf.plot(ax=ax, color='none', edgecolor='k')
    ax.set_xlim(*xlim)
    ax.set_ylim(*ylim)
    

    pcolormesh plot of gridded data, masked to polygons