Search code examples
javascriptalgorithmmathintervals

Split set of intervals into minimal set of disjoint intervals


How can I efficiently split a set of intervals (the input set) into a minimal set of disjoint intervals (the output set), in such a way that all intervals from the input set can be expressed as unions of intervals from the output set ?

Examples :

Input: [0,9] [2,12]
Output: [0,1] [2,9] [10,12]

Test :
[0,9] = [0,1] ∪ [2,9]
[2,12] = [2,9] ∪ [10,12]

Input: [0,Infinity] [1,5] [4,6]
Output: [0,0] [1,3] [4,5] [6,6] [7,Infinity]

Test :
[0,Infinity] = [0,0] ∪ [1,3] ∪ [4,5] ∪ [6,6] ∪ [7,Infinity]
[1,5] = [1,3] ∪ [4,5]
[4,6] = [4,5] ∪ [6,6]

I need to do this in Javascript. Here is the idea I tried :

// The input is an array of intervals, like [[0,9], [2,12]], same for the output

// This function converts a set of overlapping
// intervals into a set of disjoint intervals...
const disjoin = intervals => {
  if(intervals.length < 2)
    return intervals
  const [first, ...rest] = intervals
  // ...by recursively injecting each interval into
  // an ordered set of disjoint intervals
  return insert(first, disjoin(rest))
}

// This function inserts an interval [a,b] into
// an ordered set of disjoint intervals
const insert = ([a, b], intervals) => {
  // First we "locate" a and b relative to the interval
  // set (before, after, or index of the interval within the set
  const pa = pos(a, intervals)
  const pb = pos(b, intervals)

  // Then we bruteforce all possibilities
  if(pa === 'before' && pb === 'before') 
    return [[a, b], ...intervals]
  if(pa === 'before' && pb === 'after')
    // ...
  if(pa === 'before' && typeof pb === 'number')
    // ...
  // ... all 6 possibilities
}

const first = intervals => intervals[0][0]
const last = intervals => intervals[intervals.length-1][1]

const pos = (n, intervals) => {
  if(n < first(intervals))
    return 'before'
  if(n > last(intervals))
    return 'after'
  return intervals.findIndex(([a, b]) => a <= n && n <= b)
}

But it is very inefficient. In the pos function, I could do a binary search to speed things up, but I mainly wonder if :

  • This is a known problem and has a name in the algorithmic world
  • There is an optimal solution which has nothing to do with what I tried

Solution

  • Every boundary point in the input set also needs to be in the output set. The interval between each adjacent pair of boundary points is in the output if it's inside at least one input.

    splitIntervals = (input) => {
        const starts = input.map(x => [x[0],1]);
        const ends = input.map(x => [x[1]+1, -1]);   
        let count=0;
        let prev=null;
        return [...starts, ...ends]
            .sort((a,b) => (a[0] - b[0])) //sort boundary points
            .map(x => {
                //make an interval for every section that is inside any input interval
                const ret= (x[0] > prev && count !== 0 ? [prev, x[0]-1] : null);
                prev=x[0];
                count+=x[1];
                return ret;
            })
            .filter(x => !!x);
    }
    

    Test:

    > splitIntervals([ [0,9], [2,12] ])
    [ [ 0, 1 ], [ 2, 9 ], [ 10, 12 ] ]
    > splitIntervals([[0,9], [3,9], [4,13]])
    [ [ 0, 2 ], [ 3, 3 ], [ 4, 9 ], [ 10, 13 ] ]