Search code examples
computational-geometryquadtreepoint-in-polygon

algorithm to recursively divide a polygon into in/out quadrants: what's it called and where's the code?


I have a lot of points (hundreds of thousands) and I want to check which ones are inside a polygon. For a relatively small polygon (i.e., likely to contain only tens or hundreds of points) I can just use the bounding box of the polygon as an initial check, and then do a regular point-in-poly check for those points inside the box. But imagine a large (i.e., likely to contain thousands of my points), irregularly shaped polygon. Many points will pass the bounding box check, and furthermore the point-in-poly check will be more expensive because the larger polygon is made up of many more points. So I'd like to be able to filter most points in or out without having to do the full point-in-poly check.

So, I have a plan, and mainly I want to know if what I'm describing is a well-known algorithm, and if so what it's called and where I might find existing code for it. I don't believe what I'm describing is either a quad-tree or an r-tree, and I don't know how to search for it. I'm calling it a "rect tree" below.

The idea is, to handle these larger polygons:

Do a "rect tree" pre-process, where the depth of the rect tree varies by the size of the polygon (i.e., allow more depth for a larger polygon). The rect tree would divide the bounding box of the polygon into four quarters. It would check if each quarter-rect is fully inside the polygon, fully outside the polygon, or neither. In the case of neither it would recursively divide the subrects, continuing in this way until all rects were either fully inside or outside, or the max depth had been reached. So the idea is that (a) the pre-processing time to make this tree, even though it itself will do several point-in-polygon checks, is well worth it because that time is dwarfed by the number of points to be checked, and (b) the vast majority of points can be dealt with using simple bounding box checks (generally a few such checks as you descend the tree), and then a relatively small number would have to do the full point-in-polygon check (for when you reach a leaf node that is still "neither").

What's that algorithm called? And where is the code? It doesn't in fact seem so hard to write, but I figured I'd ask before jumping into coding.


Solution

  • I actually ended up using a related but different approach. I realized that essentially this tree structure I was building up was no more than the polygon drawn at a low resolution. For instance, if my tree went down to a depth of 8, really that was just like drawing my polygon on a bitmap with resolution 256x256 and then doing pixel hit tests against that polygon. So I extended that idea and used a fast graphics library (the CImg library). I draw the polygon on a black-and-white bitmap of size 4000x4000. Then I just check the points as pixels against that bitmap. The magic is that drawing that huge bitmap is really fast compared to the time it was taking me to construct the tree. So I get much higher resolution than I ever could have with my tree.

    One issue is being able to detect points near the very edge of the polygon, which may be included or excluded incorrectly due to rounding/resolution issues, even at the 4000x4000 size. If you need to know precisely whether those points are in or out, you could draw a stroke around the polygon in another color, and if your pixel test hit that color, you'd do the full point in poly check. For my purposes the 4000x4000 resolution was good enough (I could tolerate incorrect inclusion/exclusion for some of my edge points).

    So the fundamental trick of this solution is the idea that polygon drawing algorithms are just super fast compared to other ways you might "digitize" your polygon.