Tags: c++, c, bitarray, bitvector

Fast code for searching bit-array for contiguous set/clear bits?


Is there some reasonably fast code out there which can help me quickly search a large bitmap (a few megabytes) for runs of contiguous zero or one bits?

By "reasonably fast" I mean something that can take advantage of the machine word size and compare entire words at once, instead of doing bit-by-bit analysis which is horrifically slow (such as one does with vector<bool>).

It's very useful for e.g. searching the bitmap of a volume for free space (for defragmentation, etc.).


Solution

  • Windows has an RTL_BITMAP data structure one can use along with its APIs.

    But I needed the code for this some time ago, so I wrote it here (warning, it's a little ugly):
    https://gist.github.com/3206128

    I have only partially tested it, so it might still have bugs (especially in the reverse direction). But a more recent version (only slightly different from this one) seemed to be usable for me, so it's worth a try.

    The fundamental operation for the entire thing is being able to -- quickly -- find the length of a run of bits:

    long long GetRunLength(
        const void *const pBitmap, unsigned long long nBitmapBits,
        long long startInclusive, long long endExclusive,
        const bool reverse, /*out*/ bool *pBit);
    

    Everything else should be easy to build upon this, given its versatility.

    I tried to include some SSE code, but it didn't noticeably improve the performance. However, in general, the code is many times faster than doing bit-by-bit analysis, so I think it might be useful.

    It should be easy to test if you can get a hold of vector<bool>'s buffer somehow -- and if you're on Visual C++, then there's a function I included which does that for you. If you find bugs, feel free to let me know.
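
For readers who just want the idea, here is a minimal sketch of the word-at-a-time approach. It is not the code from the gist: the names `run_length_forward` and `find_run` are hypothetical, it only scans forward, and it assumes the bitmap is stored as 64-bit words with bit i at position i % 64 (counting from the least significant bit) of word i / 64, with enough words to cover `nbits`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout (not taken from the gist): bit i lives in words[i / 64],
// at position i % 64 counting from the least significant bit.
using Word = std::uint64_t;
constexpr std::size_t kWordBits = 64;

inline bool get_bit(const std::vector<Word>& words, std::size_t i) {
    return ((words[i / kWordBits] >> (i % kWordBits)) & 1u) != 0;
}

// Index of the lowest set bit; w must be nonzero. A portable loop is used
// here; a compiler intrinsic or C++20's std::countr_zero would be faster.
inline std::size_t lowest_set_bit(Word w) {
    std::size_t n = 0;
    while ((w & 1u) == 0) { w >>= 1; ++n; }
    return n;
}

// Length of the run of identical bits beginning at `start`, capped at nbits.
// The value of the run is written to *bit_out if bit_out is non-null.
std::size_t run_length_forward(const std::vector<Word>& words, std::size_t nbits,
                               std::size_t start, bool* bit_out) {
    if (start >= nbits) return 0;
    const bool bit = get_bit(words, start);
    if (bit_out) *bit_out = bit;
    // XOR with this mask turns every bit that differs from `bit` into a 1.
    const Word flip = bit ? ~Word(0) : Word(0);
    std::size_t pos = start;
    while (pos < nbits) {
        const std::size_t offset = pos % kWordBits;
        // Differing bits in this word, with the bits below `pos` shifted out.
        const Word diff = (words[pos / kWordBits] ^ flip) >> offset;
        if (diff != 0) {                // the run ends inside this word
            pos += lowest_set_bit(diff);
            break;
        }
        pos += kWordBits - offset;      // the rest of this word matched
    }
    if (pos > nbits) pos = nbits;       // ignore padding bits in the last word
    return pos - start;
}

// Start of the first run of at least `want` bits equal to `value`,
// or nbits if there is no such run.
std::size_t find_run(const std::vector<Word>& words, std::size_t nbits,
                     bool value, std::size_t want) {
    std::size_t pos = 0;
    while (pos < nbits) {
        bool bit = false;
        const std::size_t len = run_length_forward(words, nbits, pos, &bit);
        if (bit == value && len >= want) return pos;
        pos += len;                     // len is at least 1 while pos < nbits
    }
    return nbits;
}
```

With this sketch, something like `find_run(words, nbits, false, 4096)` would look for the first stretch of 4096 clear bits, which is the free-space search from the question. A word that matches the run entirely costs one XOR and one comparison, so long runs are skipped 64 bits at a time.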