Search code examples
scalaparallel-collections

Thread pool has less number of threads than what is set (in Scala)


I am using parallel collection to run some code in parallel. Here is the code

val threadIds = new ConcurrentSkipListSet[Long]()
val pool = new ForkJoinPool(250)
val forkJoinSupport = new ForkJoinTaskSupport(pool)
list.par.taskSupport = forkJoinSupport
list.par.map{ element =>
  threadIds.add(Thread.currentThread().getId)
  ...
}
println(s"""No of actual threads in pool: ${threadIds.size()}: Threads = ${threadIds.asScala.mkString(",")}""")

The output from the println statement always is 64 whereas the expected thread count is 250

No of actual threads in pool: 64: Threads = ...

Am I missing something here?

Note: The machine in which this application runs has 8 cores.


Solution

  • As mentioned in a comment, you are not reading the size of the pool, but rather that of the collection (and again as mentioned there, you are creating two separate parallel collections by invoking par twice and working on them as if they were one). Furthermore, the ForkJoinPool makes no guarantees with regards to the size of the pool when the task queue is empty. As the following Scala shell session shows, threads are spun up lazily based on whether they are needed, and they are capped to the level of parallelism you ask for at construction:

    scala> import java.util.concurrent.ForkJoinPool
    import java.util.concurrent.ForkJoinPool
    
    scala> val pool = new ForkJoinPool(250)
    val pool: java.util.concurrent.ForkJoinPool = java.util.concurrent.ForkJoinPool@5e2a6991[Running, parallelism = 250, size = 0, active = 0, running = 0, steals = 0, tasks = 0, submissions = 0]
    
    scala> pool.getPoolSize
    val res3: Int = 0
    
    scala> val sleep: Runnable = () => while (true) Thread.sleep(1000)
    val sleep: Runnable = $Lambda$1182/0x0000000840644040@8585cdd
    
    scala> for (_ <- 1 to 50) pool.execute(sleep)
    
    scala> pool.getPoolSize
    val res5: Int = 51
    
    scala> pool.getPoolSize
    val res6: Int = 51
    
    scala> for (_ <- 1 to 200) pool.execute(sleep)
    
    scala> pool.getPoolSize
    val res8: Int = 250
    
    scala> for (_ <- 1 to 200) pool.execute(sleep)
    
    scala> pool.getPoolSize
    val res10: Int = 250