Search code examples
project-planningoptaplannerresource-scheduling

Is this an intelligent use case for optaPlanner?


I'm trying to clean up an enterprise BI system that currently is using a prioritized FIFO scheduling algorithm (so a priority 4 report from Tuesday will be executed before priority 4 reports from Thursday and priority 3 reports from Monday.) Additional details:

  • The queue is never empty, jobs are always being added
  • Jobs range in execution time from under a minute to upwards of 24 hours
  • There are 40 some odd identical app servers used to execute jobs

I think I could get optaPlanner up and running for this scenario, with hard rules around priority and some soft rules around average time in the queue. I'm new to scheduling optimization so I guess my question is what should I be looking for in this situation to decide if optaPlanner is going to help me or not?


Solution

  • The problem looks like a form of bin packing (and possibly job shop scheduling), which are NP-complete, so OptaPlanner will do better than a FIFO algorithm.

    But is it really NP-complete? If all of these conditions are met, it might not be:

    • All 40 servers are identical. So running a priority report on server A instead of server B won't deliver a report faster.
    • All 40 servers are identical. So total duration (for a specific input set) is a constant.
    • Total makespan doesn't matter. So given 20 small jobs of 1 hour and 1 big job of 20 hours and 2 machines, it's fine that it takes all small jobs are done after 10 hours before the big job starts, given a total makespan of 30 hours. There's no desire to reduce the makespan to 20 hours.
    • "the average time in the queue" is debatable: do you care about how long the jobs are in the queue until they are started or until they are finished? If the total duration is a constant, this can be done by merely FIFO'ing the small jobs first or last (while still respecting priority of course).
    • There are no dependencies between jobs.

    If all these conditions are met, OptaPlanner won't be able to do better than a correctly written greedy algorithm (which schedules the highest priority job that is the smallest/largest first). If any of these conditions aren't met (for example you buy 10 new servers which are faster), then OptaPlanner can do better. You just have to evaluate if it's worth spending 1 thread to figure that out.

    If you use OptaPlanner, definitely take a look at real-time scheduling and daemon mode, to replan as new reports enter the system.