Search code examples
javamultithreadingexecutorservicefork-joinspliterator

Simple multi-threaded Java app - ExecutorService? Fork/Join? Spliterators?


I am writing a command-line application in Java 8. There's a part that involves some computation, and I believe it could benefit from running in parallel using multiple threads. However, I have not much experience in writing multi-threaded applications, so I hope you could steer me in the right direction how should I design the parallel part of my code.

For simplicity, let's pretend the method in question receives a relatively big array of longs, and it should return a Set containing only prime numbers:

public final static boolean checkIfNumberIsPrime(long number) {
  // algorithm implementation, not important here
  // ...
}

// a single-threaded version
public Set<Long> extractPrimeNumbers(long[] inputArray) {
  Set<Long> result = new HashSet<>();
  for (long number : inputArray) {
    if (checkIfNumberIsPrime(number)) {
      result.add(number);
    }
  }
  return result;
}

Now, I would like to refactor method extractPrimeNumbers() in such way that it would be executed by four threads in parallel, and when all of them are finished, return the result. Off the top of my head, I have the following questions:

  • Which approach would be more suitable for the task: ExecutorService or Fork/Join? (each element of inputArray[] is completely independent and they can be processed in any order whatsoever)
  • Assuming there are 1 million elements in inputArray[], should I "ask" thread #1 to process all indexes 0..249999, thread #2 - 250000..499999, thread #3 - 500000..749999 and thread #4 - 750000..999999? Or should I rather treat each element of inputArray[] as a separate task to be queued and then executed by an applicable worker thread?
  • If a prime number is detected, it should be added to `Set result, therefore it needs to be thread-safe (synchronized). So, perhaps it would be better if each thread maintained its own, local result-set, and only when it is finished, it would transfer its contents to the global result, in one go?
  • Is Spliterator of any use here? Should they be used to partition inputArray[] somehow?

Solution

  • Parallel stream

    Use none of these. Parallel streams are going to be enough to deal with this problem much more straightforwardly than any of the alternatives you list.

    return Arrays.parallelStream(inputArray)
      .filter(n -> checkIfNumberIsPrime(n))
      .boxed()
      .collect(Collectors.toSet());
    

    For more info, see The Java™ Tutorials > Aggregate Operations > Parallelism.