This little code snippet never finishes on jdk8u45, and used to finish properly on jdk8u20:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class TestForkJoinPool {
    static final ExecutorService pool = Executors.newWorkStealingPool(8);
    private static volatile long consumedCPU = System.nanoTime();

    public static void main(String[] args) throws InterruptedException {
        final int numParties = 100;
        final Phaser p = new Phaser(1);
        final Runnable r = () -> {
            p.register();
            p.arriveAndAwaitAdvance();
            p.arriveAndDeregister();
        };
        for (int i = 0; i < numParties; ++i) {
            consumeCPU(1000000);
            pool.submit(r);
        }
        while (p.getArrivedParties() != numParties) {}
    }

    static void consumeCPU(long tokens) {
        // Taken from the JMH blackhole
        long t = consumedCPU;
        for (long i = tokens; i > 0; i--) {
            t += (t * 0x5DEECE66DL + 0xBL + i) & (0xFFFFFFFFFFFFL);
        }
        if (t == 42) {
            consumedCPU += t;
        }
    }
}
The javadoc of Phaser states:
Phasers may also be used by tasks executing in a ForkJoinPool, which will ensure sufficient parallelism to execute tasks when others are blocked waiting for a phase to advance.
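For illustration, here is a minimal sketch (not from the original post) of that cooperation working as documented: a ForkJoinPool with parallelism 2 runs 16 tasks that all block on the same phaser, and the pool grows past its parallelism instead of deadlocking:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;

public class PhaserFjpDemo {
    public static void main(String[] args) throws InterruptedException {
        final int parties = 16;                    // far more tasks than pool threads
        final ForkJoinPool pool = new ForkJoinPool(2);
        final Phaser phaser = new Phaser(parties); // pre-register all parties

        for (int i = 0; i < parties; i++) {
            pool.submit(() -> {
                // Blocks until all 16 parties arrive. Phaser cooperates with
                // ForkJoinPool internally (via managedBlock), so the pool can
                // spawn compensating workers instead of deadlocking at 2.
                phaser.arriveAndAwaitAdvance();
            });
        }

        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS))
            throw new IllegalStateException("deadlocked");
        System.out.println("phase advanced: " + phaser.getPhase());
    }
}
```

All parties are registered up front here, which sidesteps the dynamic-registration races of the reproducer and isolates the blocking behavior.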
However, the javadoc of ForkJoinPool#managedBlock states:
If running in a ForkJoinPool, the pool may first be expanded to ensure sufficient parallelism
Note the "may" there. So I am not sure whether this is a bug, or just bad code that should not rely on the contract between Phaser and ForkJoinPool: how hard does that combined contract work to prevent deadlocks?
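To make the managedBlock mechanism concrete, here is a small sketch (class and variable names are mine, not from the question): on a parallelism-1 pool, one task blocks through a ManagedBlocker while a second task, which can only run if the pool compensates, releases it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class ManagedBlockDemo {
    public static void main(String[] args) throws InterruptedException {
        final CountDownLatch latch = new CountDownLatch(1);
        final ForkJoinPool pool = new ForkJoinPool(1); // a single worker

        // Task A blocks via managedBlock; the pool may spawn a
        // compensating worker so other tasks can still make progress.
        pool.submit(() -> {
            try {
                ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
                    public boolean block() throws InterruptedException {
                        latch.await();
                        return true;
                    }
                    public boolean isReleasable() {
                        return latch.getCount() == 0;
                    }
                });
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Task B releases the latch; without compensation it could never
        // run while task A holds the pool's only worker.
        pool.submit(latch::countDown);

        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS))
            throw new IllegalStateException("deadlocked");
        System.out.println("completed");
    }
}
```

Phaser#arriveAndAwaitAdvance uses this same hook internally when it detects it is running on a ForkJoinPool worker.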
My config:
It looks like your problem comes from a change in the ForkJoinPool code between JDK 8u20 and 8u45.
In u20, ForkJoin threads were always alive for at least 200 milliseconds (see ForkJoinPool.FAST_IDLE_TIMEOUT) before being reclaimed.
In u45, once the ForkJoinPool has reached its target parallelism plus 2 extra threads, threads die as soon as they run out of work, without waiting. You can see this change in the awaitWork method in ForkJoinPool.java (line 1810):
int t = (short)(c >>> TC_SHIFT); // shrink excess spares
if (t > 2 && U.compareAndSwapLong(this, CTL, c, prevctl))
return false;
Your program uses Phaser tasks to force the pool to create extra workers: each task that blocks on the phaser spawns a compensating worker, which is meant to pick up the next submitted task.
However, once the pool reaches target parallelism + 2, each compensating worker dies immediately instead of waiting, so it never gets a chance to pick up the task that is submitted right after it.
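If upgrading the JDK is not an option, one mitigation consistent with this analysis (my suggestion, not a guaranteed fix) is to submit the tasks back to back, so that a compensating worker always finds a queued task instead of dying idle, to pre-register all parties, and to replace the busy-wait with a blocking awaitAdvance:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;

public class TestForkJoinPoolWorkaround {
    public static void main(String[] args) throws InterruptedException {
        final ExecutorService pool = Executors.newWorkStealingPool(8);
        final int numParties = 100;
        // Pre-register every party instead of registering inside each task;
        // the phase then advances exactly when all 100 tasks have arrived.
        final Phaser p = new Phaser(numParties);

        // Submit back to back: each blocked task's compensating worker
        // immediately finds the next queued task.
        for (int i = 0; i < numParties; ++i) {
            pool.submit((Runnable) p::arriveAndAwaitAdvance);
        }

        // Block until phase 0 completes instead of spinning on
        // getArrivedParties().
        p.awaitAdvance(0);
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        System.out.println("done");
    }
}
```

This narrows the window rather than removing the underlying u45 behavior, so it is a workaround, not a fix.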
I hope this helps.