How can I efficiently split a set of intervals (the input set) into a minimal set of disjoint intervals (the output set), in such a way that all intervals from the input set can be expressed as unions of intervals from the output set ?
Examples :
Input: [0,9] [2,12]
Output: [0,1] [2,9] [10,12]
Test :
[0,9] = [0,1] ∪ [2,9]
[2,12] = [2,9] ∪ [10,12]
Input: [0,Infinity] [1,5] [4,6]
Output: [0,0] [1,3] [4,5] [6,6] [7,Infinity]
Test :
[0,Infinity] = [0,0] ∪ [1,3] ∪ [4,5] ∪ [6,6] ∪ [7,Infinity]
[1,5] = [1,3] ∪ [4,5]
[4,6] = [4,5] ∪ [6,6]
I need to do this in Javascript. Here is the idea I tried :
// The input is an array of intervals, like [[0,9], [2,12]], same for the output
// This function converts a set of overlapping
// intervals into a set of disjoint intervals...
const disjoin = intervals => {
if(intervals.length < 2)
return intervals
const [first, ...rest] = intervals
// ...by recursively injecting each interval into
// an ordered set of disjoint intervals
return insert(first, disjoin(rest))
}
// This function inserts an interval [a,b] into
// an ordered set of disjoint intervals
const insert = ([a, b], intervals) => {
// First we "locate" a and b relative to the interval
// set (before, after, or index of the interval within the set
const pa = pos(a, intervals)
const pb = pos(b, intervals)
// Then we bruteforce all possibilities
if(pa === 'before' && pb === 'before')
return [[a, b], ...intervals]
if(pa === 'before' && pb === 'after')
// ...
if(pa === 'before' && typeof pb === 'number')
// ...
// ... all 6 possibilities
}
const first = intervals => intervals[0][0]
const last = intervals => intervals[intervals.length-1][1]
const pos = (n, intervals) => {
if(n < first(intervals))
return 'before'
if(n > last(intervals))
return 'after'
return intervals.findIndex(([a, b]) => a <= n && n <= b)
}
But it is very inefficient. In the pos
function, I could do a binary search to speed things up, but I mainly wonder if :
Every boundary point in the input set also needs to be in the output set. The interval between each adjacent pair of boundary points is in the output if it's inside at least one input.
splitIntervals = (input) => {
const starts = input.map(x => [x[0],1]);
const ends = input.map(x => [x[1]+1, -1]);
let count=0;
let prev=null;
return [...starts, ...ends]
.sort((a,b) => (a[0] - b[0])) //sort boundary points
.map(x => {
//make an interval for every section that is inside any input interval
const ret= (x[0] > prev && count !== 0 ? [prev, x[0]-1] : null);
prev=x[0];
count+=x[1];
return ret;
})
.filter(x => !!x);
}
Test:
> splitIntervals([ [0,9], [2,12] ])
[ [ 0, 1 ], [ 2, 9 ], [ 10, 12 ] ]
> splitIntervals([[0,9], [3,9], [4,13]])
[ [ 0, 2 ], [ 3, 3 ], [ 4, 9 ], [ 10, 13 ] ]