Search code examples
algorithmpuzzle

Find the smallest set of overlapping jobs


A friend gave me a puzzle that he says can be solved in better than O(n^3) time.

Given a set of n jobs that each have a set start time and end time (overlaps are very possible), find the smallest subset that for every job either includes that job or includes a job that has overlap with that job.

I'm pretty sure that the optimal solution is to pick the job with the most unmarked overlap, add it to the solution set, then mark it, and its overlap. And repeat until all jobs are marked.
Figuring out which job has the most unmarked overlappers is a simple adjacency matrix (O(n^2)), and this has to be redone every time a job is selected, in order to update the marks, making it O(n^3).

Is there a better solution?


Solution

  • Let A be the set of jobs which we haven't overlapped yet.

    1. Find the job x in A which has the minimal end time (t).
    2. From all jobs whose start time is less than t: pick the job j with the maximum end time.
    3. Add j to the output set.
    4. Remove all jobs which overlap j from A.
    5. Repeat 1-4 until A is empty.

    A simple implementation will run in O(n^2). Using interval trees it's probably possible to solve in O(n*logn).

    The basic idea behind why it's an optimal solution (not a formal proof): We have to pick one job whose start time is less than t, so that x will be overlapped. If we let S be the set of all jobs whose start time is less than t, it can be shown that j will overlap the same jobs as any job in S, plus possibly more. Since we have to pick one job in S, the best choice is j. We can use this idea to form a proof by induction on the number of jobs.