Search code examples
pythongispolygonshapely

Substracting inner rings from a list of polygons


I have a large list of polygons (>10^6) most of which are non-intersecting but some of these polygons are hole of another polygon (~10^3 cases). Here a image to explain the problem, the smaller polygon is a hole in the larger polygon but both are independent polygon in the list of polygons. A large polygon with smaller polygon on the inside.

Now I would like to efficiently determine which polygons are holes and substract the holes i.e. subtract the smaller polygons which lie completely inside in another polygon and return a list of "cleaned" polygons. A pair of hole and parent polygon should be transformed like this (so basically hole subtracted from the parent): enter image description here

There are plenty of similar questions on Stackoverflow and gis.stackexchange.com but I haven't found one that actually solves this problems. Here are some related questions: 1. https://gis.stackexchange.com/questions/5405/using-shapely-translating-between-polygons-and-multipolygons 2. https://gis.stackexchange.com/questions/319546/converting-list-of-polygons-to-multipolygon-using-shapely

Here is a sample code.

from shapely.geometry import Point
from shapely.geometry import MultiPolygon
from shapely.ops import unary_union
import numpy as np

#Generate a list of polygons, where some are holes in others; 
def generateRandomPolygons(polygonCount = 100, areaDimension = 1000, holeProbability = 0.5):
    pl = []
    radiusLarge = 2 #In the real dataset the size of polygons can vary
    radiusSmall = 1 #Size of holes can also vary

    for i in range(polygonCount):
        x, y = np.random.randint(0,areaDimension,(2))
        rn1 = np.random.random(1)
        pl.append(Point(x, y).buffer(radiusLarge))
        if rn1 < holeProbability: #With a holeProbability add a hole in the large polygon that was just added to the list
            pl.append(Point(x, y).buffer(radiusSmall))
    return pl

polygons = generateRandomPolygons()
print(len(pl))

Output looks like this: enter image description here

Now how can I create a new list of polygons with the holes removed. Shapely provides functions to subtract one polygon from another (difference) but is there a similar function for lists of polygons (maybe something like unary_union but where overlaps are removed)? Alternative how to efficiently determine which are holes and then subtract them from the larger polygons?


Solution

  • Your problem is you don't know which ones are "holes", right? To "efficiently determine which polygons are holes", you can use an rtree to speed up the intersection check:

    from rtree.index import Index
    
    # create an rtree for efficient spatial queries
    rtree = Index((i, p.bounds, None) for i, p in enumerate(polygons))
    donuts = []
    
    for i, this_poly in enumerate(polygons):
        # loop over indices of approximately intersecting polygons
        for j in rtree.intersection(this_poly.bounds):
            # ignore the intersection of this polygon with itself
            if i == j:
                continue
            other_poly = polygons[j]
            # ensure the polygon fully contains our match
            if this_poly.contains(other_poly):
                donut = this_poly.difference(other_poly)
                donuts.append(donut)
                break  # quit searching
    
    print(len(donuts))