Search code examples
c++c++11rangeset

A container for integer intervals, such as RangeSet, for C++


I am trying to work with ranges, as in ranges of numbers. By that I mean integer intervals, in maths speak. And I want to store a set of them. I also want this set to naturally merge (or coalesce) ranges I insert.

Let's go for a simple example, I start with an empty set: { }

  • I insert the range [0,5], now I have { [0,5] }
  • I insert the range [10,15], now I have { [0,5], [10,15] }
  • I insert the range [5,7], now I have { [0,7], [10,15] }
  • I insert the range [12,17], now I have { [0,7], [10,17] }
  • I insert the range [6,13], now I have { [0,17] }

I found out thanks to a similar question that this exists in Java as a Google Guava library and is called a RangeSet.

I was initially thinking of using a std::set of std::pairs that would be sorted on the lower bound (so the first element of each pair). Then after each insertion I would have to manually merge any overlapping sets.

As this seems a common problem, is there a good implementation I couldn't find due to the noise with all the synonyms of "range" in C++? Or does anyone care to share their own? I only want to print the final ranges but bonus points for generality if you have other set operations.


Solution

  • If you encode ranges as a sequence of endpoints and step direction, instead of start/end pairs, then finding union should become much easier, just a simple mergesort.

    (0, +) (5, -)
    
    (0, +) (5, -) (10, +) (15, -)
    
    (0, +) (5, +) (5, -) (7, -) (10, +) (15, -)
    

    Look, the overlapping range appears as nested ranges. Just preserve only the outermost ones.

    (0, +) (5, +) (5, -) (7, -) (10, +) (15, -)
       1      2      2     1       1       1       <= depth
    (0, +) (7, -) (10, +) (15, -)
    
    (0, +) (7, -) (10, +) (12, +) (15, -) (17, -)
       1      1      1       2       2       1
    (0, +) (7, -) (10, +) (17, -)
    
    
    (0, +) (6, +) (7, -) (10, +) (13, -) (17, -)
       1      2      2      2       2       1
    (0, +) (17, -)
    

    I think that finding intersections becomes simple also, now you preserve only endpoints with a nesting level of 2, instead of deleting them.