Search code examples
algorithmintervalsinterval-tree

find all combinations with non-overlapped regions


Within a super-region S, there are k small subregions. The number k can be up to 200. There may be overlap between subregions. I have millions of regions S.

For each super-region, my goal is to find out all combinations in which there are 2 or more non-overlapped subregions.

Here is an example:

Super region: 1-100

Subregions: 1-8, 2-13, 9-18, 15-30, 20-35

Goal:

Combination1: 1-8, 9-18

Combination2: 1-8, 20-35

Combination3: 1-8, 9-18, 20-35

Combination4: 1-8, 15-30

...


Solution

  • Number of subsets might be exponential (max 2^k), so there is nothing wrong to traverse all possible independent subsets with recursion. I've used linear search of the next possible interval, but it is worth to exploit binary search.

    def nonovl(l, idx, right, ll):
        if idx == len(l):
            if ll:
                print(ll)
            return
    
        #find next non-overlapping interval without using l[idx]
        next = idx + 1  
        while next < len(l) and right >= l[next][0]:
            next += 1
        nonovl(l, next, right, ll)
    
        #find next non-overlapping interval after using l[idx]
        next = idx + 1
        right = l[idx][1]
        while next < len(l) and right >= l[next][0]:
            next += 1
        nonovl(l, next, right, ll + str(l[idx]))
    
    l=[(1,8),(2,13),(9,18),(15,30),(20,35)]
    l.sort()
    nonovl(l, 0, -1, "")
    
    (20, 35)
    (15, 30)
    (9, 18)
    (9, 18)(20, 35)
    (2, 13)
    (2, 13)(20, 35)
    (2, 13)(15, 30)
    (1, 8)
    (1, 8)(20, 35)
    (1, 8)(15, 30)
    (1, 8)(9, 18)
    (1, 8)(9, 18)(20, 35)