I'm trying to create a mask following this question's answer, so that I can color areas inside my shape file only. I will post the code example here (which is pretty much the same with the code in my previous question, but with the solution code included).
A quick summary of my code: It's a data of (lon, lat, tem). I transform the lon and lat into ccrs.PlateCaree()
coordinate system (which is in meter and not degree anymore), and put the transformed coordinates in xp, yp. The 3 variables xp, yp, and tem are then interpolated and then I use pmeshcolor() to draw a color map (the same color map in my previous question that I left the link above). Now I want to limit the color map to be inside the shape file only (which is the grey lines). Code is as below:
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
import matplotlib.colors as clr
import pandas as pd
import numpy as np
from metpy.interpolate import interpolate_to_grid, remove_nan_observations
from cartopy.io.shapereader import Reader
from cartopy.feature import ShapelyFeature
canada_east = -95
canada_west = -101.8
canada_north = 52.8
canada_south = 48.85
central_lon = (canada_east + canada_west)/2
central_lat = (canada_north + canada_south)/2
crs = ccrs.LambertConformal(central_longitude = central_lon, central_latitude = central_lat)
lat = np.array([49.8134 50.904 50.698 49.095 49.436 49.9607 49.9601 49.356 50.116
49.402 52.3472 50.411 49.24 49.876 49.591 49.905 49.498 49.088
49.118 50.5947 49.3776 49.148 49.1631 51.358 49.826 50.4324 49.96
49.68 49.875 50.829 51.572])
lon = np.array([-100.3721 -97.273 -99.068 -97.528 -100.308 -98.9054 -98.6367
-99.248 -96.434 -100.93 -101.1099 -100.893 -100.055 -99.909
-97.518 -99.354 -98.03 -99.325 -99.054 -98.0035 -100.5387
-100.491 -97.1454 -100.361 -96.776 -99.4392 -97.7463 -97.984
-95.92 -98.111 -100.488])
tem = np.array([-8.45 -4.026 -5.993 -3.68 -7.35 -7.421 -6.477 -8.03 -3.834
-13.04 -4.057 -8.79 -6.619 -10.89 -4.465 -8.41 -4.861 -9.93
-7.125 -4.424 -11.95 -9.56 -3.86 -7.17 -4.193 -7.653 -4.883
-5.631 -3.004 -4.738 -8.81])
xp, yp, _ = crs.transform_points(ccrs.PlateCarree(), lon, lat ).T
xp, yp, tem = remove_nan_observations(xp, yp, tem)
alt_x, alt_y, data = interpolate_to_grid( xp, yp, tem, minimum_neighbors=2, search_radius=240000, interp_type = 'barnes', hres = 1000)
# Create the figure and grid for subplots
fig = plt.figure(figsize=(17, 12))
# Main ax
ax = plt.subplot(111, projection=crs)
ax.set_extent([canada_west, canada_east, canada_south, canada_north], ccrs.PlateCarree())
# Read shape file again using geopandas and create a mask
cat_gdf = geopandas.read_file('MB_AGregion_Perim_South.shp')
cat_gdf = cat_gdf.to_crs(epsg = 4326)
mask = shapely.vectorized.contains(cat_gdf.dissolve().geometry.item(), alt_x, alt_y)
print(mask)
which gives
[[False False False ... False False False]
[False False False ... False False False]
[False False False ... False False False]
...
[False False False ... False False False]
[False False False ... False False False]
[False False False ... False False False]]
Process finished with exit code 0
My alt_x and alt_y is in meter and my shape file geometries are in degree. I believe this is the reason why the mask contains all false value (which means masking everything). However, I'm unable to find a solution to this problem. I would appreciate a lot if someone can help me with this problem.
Here's another MRE. Keeping things very simple - I'm just using the bbox and vector of (x, y) values rather than the mesh you have. But it should be enough to illustrate the issue you're dealing with.
import cartopy.crs as ccrs
import shapely.geometry
import shapely.vectorized
import numpy as np, pandas as pd, geopandas as gpd, matplotlib.pyplot as plt
canada_east = -95
canada_west = -101.8
canada_north = 52.8
canada_south = 48.85
central_lon = (canada_east + canada_west)/2
central_lat = (canada_north + canada_south)/2
crs = ccrs.LambertConformal(
central_longitude=central_lon,
central_latitude=central_lat,
)
x = np.arange(-120, -90)
y = np.ones_like(x) * 50
bbox = shapely.geometry.Polygon([
(canada_west, canada_south),
(canada_east, canada_south),
(canada_east, canada_north),
(canada_west, canada_north),
])
gdf = gpd.GeoDataFrame(geometry=[bbox], crs='epsg:4326')
Here's what x, y, and the geodataframe look like. Note that they're all in degrees, techincally WGS84, aka 'epsg:4326'
:
In [2]: print(f"x values: {x}\n\ny values: {y}\n\ngeodataframe:\n{gdf}")
x values: [-120 -119 -118 -117 -116 -115 -114 -113 -112 -111 -110 -109 -108 -107
-106 -105 -104 -103 -102 -101 -100 -99 -98 -97 -96 -95 -94 -93
-92 -91]
y values: [50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50
50 50 50 50 50 50]
geodataframe:
geometry
0 POLYGON ((-101.80000 48.85000, -95.00000 48.85...
If we use shapely.vectorized.contains when both the points and the geodataframe are in degrees, we get the expected answer:
In [3]: shapely.vectorized.contains(gdf.dissolve().geometry.item(), x, y)
Out[3]:
array([False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, False,
False, True, True, True, True, True, True, False, False,
False, False, False])
The CRS PlateCarree()
is a plotting CRS which preserves lat/lon values, so you can think of it as equivalent to WGS84 (only difference is PlateCarree has a defined mapping to pixel values). So the following translates our lat/lon values into the CRS of our object CRS
, which is LambertConformal
:
In [4]: xp, yp, _ = crs.transform_points(ccrs.PlateCarree(), x, y ).T
Now, our xp
and yp
values are in meters and look very different from the values in the geodataframe:
In [5]: print(f"xp values: {xp}\n\nyp values: {yp}\n\ngeodataframe:\n{gdf}")
xp values: [-1555398.30438114 -1484657.61843652 -1413737.15236547 -1342645.49406745
-1271391.25217193 -1199983.05499594 -1128429.54949926 -1056739.40023735
-984921.28831211 -912983.91032072 -840935.97730252 -768786.21368415
-696543.35622316 -624216.15294999 -551813.36210868 -479343.75109632
-406816.09540141 -334239.17754116 -261621.78599806 -188972.71415562
-116300.7592336 -43614.72122269 29076.59818104 101764.396642
174439.87225098 247094.22459092 319718.65580267 392304.37165025
464842.58258582 537324.50481399]
yp values: [ 92537.27851361 75810.36405389 59862.8937992 44696.79886026
30313.91572949 16715.98605863 3904.65644788 -8118.52175353
-19352.09263517 -29794.69590175 -39445.06703778 -48302.03746074
-56364.53466258 -63631.58233956 -70102.30051051 -75775.90562338
-80651.71065011 -84729.12516982 -88007.65544033 -90486.90445793
-92166.57200545 -93046.45468863 -93126.44596073 -92406.53613545
-90886.8123881 -88567.45874503 -85448.75606135 -81531.08198695
-76814.91092072 -71300.81395314]
geodataframe:
geometry
0 POLYGON ((-101.80000 48.85000, -95.00000 48.85...
At this point, if we use contains
to find points within the geodataframe, it's clear why we get only False values:
In [6]: shapely.vectorized.contains(gdf.dissolve().geometry.item(), xp, yp)
Out[6]:
array([False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, False,
False, False, False])
To convert the geodataframe to our target crs
, we can use to_crs
:
In [7]: gdf_lambert = gdf.to_crs(crs)
Now, our values line up, and contains works as expected once again:
In [8]: print(f"xp values: {xp}\n\nyp values: {yp}\n\ngdf_lambert:\n{gdf_lambert}")
xp values: [-1555398.30438114 -1484657.61843652 -1413737.15236547 -1342645.49406745
-1271391.25217193 -1199983.05499594 -1128429.54949926 -1056739.40023735
-984921.28831211 -912983.91032072 -840935.97730252 -768786.21368415
-696543.35622316 -624216.15294999 -551813.36210868 -479343.75109632
-406816.09540141 -334239.17754116 -261621.78599806 -188972.71415562
-116300.7592336 -43614.72122269 29076.59818104 101764.396642
174439.87225098 247094.22459092 319718.65580267 392304.37165025
464842.58258582 537324.50481399]
yp values: [ 92537.27851361 75810.36405389 59862.8937992 44696.79886026
30313.91572949 16715.98605863 3904.65644788 -8118.52175353
-19352.09263517 -29794.69590175 -39445.06703778 -48302.03746074
-56364.53466258 -63631.58233956 -70102.30051051 -75775.90562338
-80651.71065011 -84729.12516982 -88007.65544033 -90486.90445793
-92166.57200545 -93046.45468863 -93126.44596073 -92406.53613545
-90886.8123881 -88567.45874503 -85448.75606135 -81531.08198695
-76814.91092072 -71300.81395314]
gdf_lambert:
geometry
0 POLYGON ((-251935.076 -217891.780, 251935.076 ...
In [9]: shapely.vectorized.contains(gdf_lambert.dissolve().geometry.item(), xp, yp)
Out[9]:
array([False, False, False, False, False, False, False, False, False,
False, False, False, False, False, False, False, False, False,
False, True, True, True, True, True, True, True, False,
False, False, False])
We can also see this very clearly when plotting. Here's the points and shapefile in lat/lon space, using the contains
mask to color the points blue if outside and green if inside:
In [12]: fig, ax = plt.subplots()
...: gdf.plot(ax=ax, alpha=0.5)
...: mask = shapely.vectorized.contains(gdf.dissolve().geometry.item(), x, y)
...: ax.scatter(x[mask], y[mask], color='green')
...: ax.scatter(x[~mask], y[~mask], color='blue')
Out[12]: <matplotlib.collections.PathCollection at 0x18f5a3ac0>
If we mix and match projections, none of the points are in the shape (which is a tiny invisible spec near (0, 0) when the axes are scaled by meters):
In [14]: fig, ax = plt.subplots()
...: gdf.plot(ax=ax, alpha=0.5)
...: mask = shapely.vectorized.contains(gdf.dissolve().geometry.item(), xp, yp)
...: ax.scatter(xp[mask], yp[mask], color='green')
...: ax.scatter(xp[~mask], yp[~mask], color='blue')
Out[14]: <matplotlib.collections.PathCollection at 0x1919e58d0>
When the shapefile and the points are both in LambertConformal, we're back in business:
In [16]: fig, ax = plt.subplots()
...: gdf_lambert.plot(ax=ax, alpha=0.5)
...: mask = shapely.vectorized.contains(gdf_lambert.dissolve().geometry.item(), xp, yp)
...: ax.scatter(xp[mask], yp[mask], color='green')
...: ax.scatter(xp[~mask], yp[~mask], color='blue')
Out[16]: <matplotlib.collections.PathCollection at 0x191a88ee0>