Search code examples
pythonpolygongeopandasshapelyrasterio

How to avoid polygons getting "filled" while using rasterio and geopandas


I have a numpy array that's filled with 0's and 1's. I'm trying to convert the cells filled with 1's into polygons. To do this I'm using rasterio, shapely and geopandas.

def flooded_to_shapefile(array, transform, filename):

    shapes = rasterio.features.shapes(array, transform = transform)
    
    # Select only cells with 1's
    polygons = [shapely.geometry.Polygon(shape[0]["coordinates"][0]) 
                for shape in shapes if shape[1] == 1]
    
    # Create dataframe with polygons
    df = gpd.GeoDataFrame(polygons)
    df.columns = ["geometry"]
    df.set_geometry(col='geometry', inplace=True)
    df.to_file(filename)

The problem I'm having is that in the matrix I have holes that are not correctly represented in the final polgyons. For example, look at this:

In the colored representation of my matrix I have white areas that I want to polygonize Matrix

When I run the my code I get the following.

Resulting polygons

As you can see, the star shape, for instance, is incorporated into the polygon, but since it contains 0 values it should not. The line shape[1] == 1 should be preventing the star getting selected (as in here, but the polygon still "closes". Do you have any idea of why this happens?


Solution

  • Initializing the Polygon object uses 2 arguments, shell (the outer ring) and holes (the inner rings). You are only giving it the shell so you get a filled polygon.

    I think you can give each shape (a geojson-like dict) in your rasterio shapes as argument to the shapely.geometry.shape function in order to get polygons with holes.

    # Select only cells with 1's
    polygons = [
        shapely.geometry.shape(shape[0]) 
        for shape in shapes if shape[1] == 1
    ]
    

    should do the trick