Search code examples
pythongisgeopandascliprasterio

Clipping geodataframe points by raster extent in Python


Background

Thanks to the kind souls who answered my previous question here: Faster methods to create geodataframe from a Dask or Pandas dataframe, I have successfully imported my massive block model as a geodataframe. This was done with the following code:

import dask_geopandas 
import dask
from dask import dataframe as dd
import geopandas as gpd  

BM = dd.read_csv(BM_path, skiprows=2, names=["X","Y","Z","Lith"])
BM["geometry"] = dask_geopandas.points_from_xy(BM,"X","Y")
BM = dask_geopandas.from_dask_dataframe(BM,geometry="geometry")
BM = BM.compute()

I've even been able to find code on here to convert my raster to a geodataframe as well!:

import rasterio as rio

with rio.Env():
    with rio.open(RAS_path) as src:
        crs = src.crs
        xmin, ymax = np.around(src.xy(0.00,0.00),9)
        xmax, ymin = np.around(src.xy(src.height-1, src.width-1),9)
        x = np.linspace(xmin, xmax, src.width)
        y = np.linspace(ymax, ymin, src.height)
        
        xs, ys = np.meshgrid(x,y)
        zs = src.read(1)
        
        mask = src.read_masks(1) > 0
        xs, ys, zs = xs[mask], ys[mask], zs[mask]
  
data = {'X': pd.Series(xs.ravel()),
        'Y': pd.Series(ys.ravel()),
        'Z': pd.Series(zs.ravel())}
          
df = pd.DataFrame(data=data)
geometry = gpd.points_from_xy(df.X,df.Y)
gdf = gpd.GeoDataFrame(df, crs=crs, geometry=geometry)

Though I'm not sure how to use this geodataframe to clip my block model, so it may have been useless...

I know how to do this in ArcGIS but its slow and clunky. I want to streamline this entire process in python without even opening Esri products.

Problem

I have been struggling to find a reliable method to clip my block model (BM) according to the X-Y extent of the raster that I exported from ArcGIS (this raster has float type value, not sure if relevant).

Using clip() I have had no luck as I believe a raster cannot be used to clip points (must convert to a polygon?) in this tool. I am wondering if anyone knows a reliable method for clipping my geodataframe points (as a regular dataframe or as a geodataframe) to a raster extent?

Initially I thought I could create a flat, single polygon from my raster extent and use that with the clip() tool but alas I am also not sure how to open a raster and change the values without reading it in as a list with rasterio.open() and rasterio.read()


Solution

  • How would you like to use your grid? Do you want to use the whole extend of the grid? or does the grid have mask inside?

    You probably need to create a vector denoting the boundary of the grid extend if you want to do a spatial clip (other methods would require the X and Y of both datasets to coinside). If you want to use the extend of the whole raster, takes the corners of the raster using rasterio (by using eg src.transform). Create a shapely polygon with those values. https://shapely.readthedocs.io/en/stable/manual.html#Polygon

    then BM.clip([polygon])

    If you want to clip using a irregular mask inside the geotiff, use rasterio.features.shapes to create polygons from the raster. https://rasterio.readthedocs.io/en/latest/api/rasterio.features.html#rasterio.features.shapes you can transform these to geopandas.GeoDataFrame of ease of use. geopandas.GeoDataFrame.from_features(geoms)

    You also need to decide what to do with any possible holes inside the grid. If you only want the outer the shell, you might be able to get away trying to just get the largest polygon.