I'm receiving a big file, with N entries. For each entry, I'm creating a new thread. I need to wait for all the N threads to be terminated.
At the beginning I was using Phaser, but its implementation is limited to 65K parties. So, is blowing up because N could be like 100K.
Then, I tried CountDownLatch. This works fine, very simple concept and very simple implementation. But I don't know the number of N.
Phaser is my solution, but is has that limit.
Any ideas?
This post is related: Flexible CountDownLatch?
Sounds like the problem you are trying to solve is processing a large amount of tasks as fast as possible and wait for the processing to finish.
The problem with processing a large amount of tasks simultaneously is that it could cause too many context switches and would essentially cripple your machine and slow the processing down above a certain (hardware dependent) number of concurrent threads. This means you need to have an upper limit on the concurrent working threads being executed.
Phaser and CountDownLatch are both synchronisation primitives, their purpose is providing access control to critical code blocks, not managing parallel execution.
I would use an Executor service in this case. It supports the addition of tasks (in many forms, including Runnable).
You can easily create an ExecutorService
using the Executors class. I'd recommend using a fixed size thread pool for this with 20-100 max threads - depending on how CPU intensive your tasks are. The more computation power required for a task, the less number of parallel threads can be processed without serious performance degradation.
There are multiple ways to wait for all the tasks to finish:
Future
instances returned by the submit
method and simply call get on all of them. This ensures that each of the tasks are executed by the time your loop finished.Executor
down, it depends on if you're writing a single shot application or a server which keeps running afterwards - in case of a server app, you'll definitely have to go with the previous approach.Finally, here is a code snippet illustrating all this:
List<TaskFromFile> tasks = loadFileAndCreateTasks();
ExecutorService executor = Executors.newFixedThreadPool(50);
for(TaskFromFile task : tasks) {
// createRunnable is not necessary in case your task implements Runnable
executor.submit(createRunnable(task));
}
// assuming single-shot batch job
executor.shutdown();
executor.awaitTermination(MAX_WAIT_TIME_MILLIS, TimeUnit.MILLISECONDS);