Search code examples
algorithmimage-processinggraphicsgeometrypolygon

Fastest algorithm for filling overlapping rectangles of pixels


I have an image. The pixel value is either 0 or 1. All pixels are initially zero.

Given a set of overlapping rectangles characterized by (bottom left pixel coordinate, width, height), I need to flip every pixel inside a rectangle. The following figure shows a toy example:

enter image description here

I have tens of millions of such images to fill. Each image has tens of thousands of pixels. The overlapping rectangles associated to each image are quite many.

The naïve approach is to just iterate through the set of rectangles for every image. But in our case, because the number of rectangles and the overlapping areas are all huge, the computational wastes are substantial.

I intend to code an algorithm by first finding the union sets for each column, like the following:

enter image description here

But before committing to it, is there a well established algorithm for solving this problem? It seems quite fundamental in computer vision / image processing.

Update, more context: the problem is related to nearest neighbor search across a large set of simulated hurricane events. A hurricane is a spatial series originally characterized by a sequence of pixels on an image. For each pixel, we engineer more features by coloring pixels in the same neighborhood. The neighborhood is usually a rectangle or a square. The feature engineering is run multiple times with increasing rectangle size, so each hurricane ends up with multiple images. These images are then flattened and concatenated to a bit array as the final feature vector for the hurricane. Jaccard distance is then used for searching the nearest neighbor. By profiling the code the bottleneck is the feature engineering, which prompted me to think about better way of filling those bit vectors.


Solution

  • Using with a very naive form of Sweep line algorithm :

    rects = [box(x, y, x + w, y + h) for (x, y), w, h in R]
    
    uu = unary_union(rects)
    _, ymin, _, ymax = uu.bounds
    xs = sorted({x for x, y in uu.exterior.coords})
    gcollections = [split(uu, LineString([(x, ymin), (x, ymax)])) for x in xs[1:-1]]
    
    hsegments = []
    for i, (lg, rg) in enumerate(pairwise(gcollections)):
        lp1, rp1, lp2, rp2 = chain.from_iterable(
            map(lambda geom: sorted(
                geom, key=lambda p: p.bounds[0]
            ), [lg.geoms, rg.geoms])
        )
        hseg = rp1.intersection(lp2)
        if i == 0:
            hsegments.append(lp1)
        hsegments.append(hseg)
        if i == (len(gcollections) - 2):
            hsegments.append(rp2)
    

    enter image description here

    Setup :

    from itertools import chain, pairwise
    
    import matplotlib.pyplot as plt
    from shapely import LineString, MultiLineString, Polygon, box, unary_union
    from shapely.ops import split
    from shapely.plotting import plot_polygon
    
    # set of rectangles (op)
    R = {
        ((60, 100), 89, 256),
        ((210, 58), 180, 212),
        ((120, 229), 209, 256),
        ((1, 143), 449, 171),
    }