Search code examples
javascalathreadpoolactorakka

Which ThreadPool in Java should I use?


There are a huge amount of tasks. Each task is belong to a single group. The requirement is each group of tasks should executed serially just like executed in a single thread and the throughput should be maximized in a multi-core (or multi-cpu) environment. Note: there are also a huge amount of groups that is proportional to the number of tasks.

The naive solution is using ThreadPoolExecutor and synchronize (or lock). However, threads would block each other and the throughput is not maximized.

Any better idea? Or is there exist a third party library satisfy the requirement?


Solution

  • A simple approach would be to "concatenate" all group tasks into one super task, thus making the sub-tasks run serially. But this will probably cause delay in other groups that will not start unless some other group completely finishes and makes some space in the thread pool.

    As an alternative, consider chaining a group's tasks. The following code illustrates it:

    public class MultiSerialExecutor {
        private final ExecutorService executor;
    
        public MultiSerialExecutor(int maxNumThreads) {
            executor = Executors.newFixedThreadPool(maxNumThreads);
        }
    
        public void addTaskSequence(List<Runnable> tasks) {
            executor.execute(new TaskChain(tasks));
        }
    
        private void shutdown() {
            executor.shutdown();
        }
    
        private class TaskChain implements Runnable {
            private List<Runnable> seq;
            private int ind;
    
            public TaskChain(List<Runnable> seq) {
                this.seq = seq;
            }
    
            @Override
            public void run() {
                seq.get(ind++).run(); //NOTE: No special error handling
                if (ind < seq.size())
                    executor.execute(this);
            }       
        }
    

    The advantage is that no extra resource (thread/queue) is being used, and that the granularity of tasks is better than the one in the naive approach. The disadvantage is that all group's tasks should be known in advance.

    --edit--

    To make this solution generic and complete, you may want to decide on error handling (i.e whether a chain continues even if an error occures), and also it would be a good idea to implement ExecutorService, and delegate all calls to the underlying executor.