Search code examples
pythonrastergdal

Clip raster into numpy arrays based on shapefile or polygons


I have a big raster file that I want to clip to 6 numpy arrays based on the polygons or shapefile. I have the shapefile and also the polygons as geopandas dataframe. Can anyone please help me how to do it using python (no arcpy)


Solution

  • I've created a little generator which should do what you need. I've opted for the generator instead of iterating over the features directly, because it's more convenient if you'd like to inspect the arrays. If you want, you can still iterate over the generator.

    import gdal
    import ogr, osr
    
    # converts coordinates to index
    
    def bbox2ix(bbox,gt):
        xo = int(round((bbox[0] - gt[0])/gt[1]))
        yo = int(round((gt[3] - bbox[3])/gt[1]))
        xd = int(round((bbox[1] - bbox[0])/gt[1]))
        yd = int(round((bbox[3] - bbox[2])/gt[1]))
        return(xo,yo,xd,yd)
    
    def rasclip(ras,shp):
        ds = gdal.Open(ras)
        gt = ds.GetGeoTransform()
    
        driver = ogr.GetDriverByName("ESRI Shapefile")
        dataSource = driver.Open(shp, 0)
        layer = dataSource.GetLayer()
    
        for feature in layer:
    
            xo,yo,xd,yd = bbox2ix(feature.GetGeometryRef().GetEnvelope(),gt)
            arr = ds.ReadAsArray(xo,yo,xd,yd)
            yield arr
    
        layer.ResetReading()
        ds = None
        dataSource = None
    

    Assuming your shapefile is called shapefile.shp and your raster big_raster.tif you can use it like so:

    gen = rasclip('big_raster.tif','shapefile.shp')
    

    # manually with next
    
    clip = next(gen)
    
    ## some processing or inspection here
    
    # clip with next feature
    
    clip = next(gen)
    

    # or with iteration
    
    for clip in gen:
    
        ## apply stuff to clip
        pass # remove