
ForkJoinPool, Phaser and managed blocking: to what extent do they work against deadlocks?


This little code snippet never finishes on jdk8u45, and used to finish properly on jdk8u20:

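import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class TestForkJoinPool {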

    final static ExecutorService pool = Executors.newWorkStealingPool(8);
    private static volatile long consumedCPU = System.nanoTime();

    public static void main(String[] args) throws InterruptedException {
        final int numParties = 100;
        final Phaser p = new Phaser(1);
        final Runnable r = () -> {
            p.register();
            p.arriveAndAwaitAdvance();
            p.arriveAndDeregister();
        };

        for (int i = 0; i < numParties; ++i) {
            consumeCPU(1000000);
            pool.submit(r);
        }

        while (p.getArrivedParties() != numParties) {}
    }

    static void consumeCPU(long tokens) {
        // Taken from JMH blackhole
        long t = consumedCPU;
        for (long i = tokens; i > 0; i--) {
            t += (t * 0x5DEECE66DL + 0xBL + i) & (0xFFFFFFFFFFFFL);
        }
        if (t == 42) {
            consumedCPU += t;
        }
    }
}

The Javadoc of Phaser states:

Phasers may also be used by tasks executing in a ForkJoinPool, which will ensure sufficient parallelism to execute tasks when others are blocked waiting for a phase to advance.

However, the Javadoc of ForkJoinPool#managedBlock states:

If running in a ForkJoinPool, the pool may first be expanded to ensure sufficient parallelism

Only a "may" there. So I am not sure whether this is a bug, or just bad code that relies on something the Phaser/ForkJoinPool contract does not guarantee: how hard does the combination of Phaser and ForkJoinPool work to prevent deadlocks?
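
For reference, managed blocking can be sketched roughly as follows. This is a minimal illustration, not the Phaser's actual internals: LatchBlocker and awaitManaged are hypothetical names introduced here, and a CountDownLatch stands in for the phase advance. Only ForkJoinPool.managedBlock and the ForkJoinPool.ManagedBlocker interface come from the JDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;

final class ManagedBlockSketch {

    // Hypothetical blocker: tells the pool whether blocking is still needed.
    static final class LatchBlocker implements ForkJoinPool.ManagedBlocker {
        private final CountDownLatch latch;

        LatchBlocker(CountDownLatch latch) {
            this.latch = latch;
        }

        @Override
        public boolean block() throws InterruptedException {
            latch.await();   // the actual blocking call
            return true;     // true: no further blocking is necessary
        }

        @Override
        public boolean isReleasable() {
            return latch.getCount() == 0; // true: blocking can be skipped
        }
    }

    // Waits for the latch through managedBlock; returns true once it is open,
    // or false if the waiting thread was interrupted.
    static boolean awaitManaged(CountDownLatch latch) {
        try {
            ForkJoinPool.managedBlock(new LatchBlocker(latch));
            return latch.getCount() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch latch = new CountDownLatch(1);
        new Thread(latch::countDown).start(); // opened by a plain thread
        System.out.println(awaitManaged(latch));
    }
}
```

When awaitManaged is called from a ForkJoinPool worker, the pool may start a spare thread while block() is waiting. As the quoted Javadoc says, that is a possibility rather than a guarantee.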


My config:

  1. Linux adc 3.14.27-100.fc19.x86_64 #1 SMP Wed Dec 17 19:36:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
  2. 8 cores i7

Solution

  • It looks like your problem comes from a change in the ForkJoinPool code between JDK 8u20 and 8u45.

    In u20, ForkJoin threads were always alive for at least 200 milliseconds (see ForkJoinPool.FAST_IDLE_TIMEOUT) before being reclaimed.

    In u45, once the ForkJoinPool has reached its target parallelism plus 2 extra threads, threads will die as soon as they run out of work without waiting. You can see this change in the awaitWork method in ForkJoinPool.java (line 1810):

        int t = (short)(c >>> TC_SHIFT);  // shrink excess spares
        if (t > 2 && U.compareAndSwapLong(this, CTL, c, prevctl))
            return false; 
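
To make the check above easier to read, here is a small standalone sketch of how the excess-worker count can be extracted from the control word. It assumes, based on my reading of the JDK 8 source comments, that the 16-bit field starting at bit 32 holds "total workers minus target parallelism". TC_SHIFT = 32 is that assumption, and packTotalCount, excessWorkers and wouldShrinkImmediately are hypothetical helpers, not JDK methods.

```java
final class CtlTotalCountSketch {
    // Assumed from the JDK 8 source: the total-count field starts at bit 32.
    static final int TC_SHIFT = 32;

    // Hypothetical helper: packs (totalWorkers - parallelism) into that field.
    static long packTotalCount(int totalWorkers, int parallelism) {
        return ((long) (totalWorkers - parallelism) & 0xFFFFL) << TC_SHIFT;
    }

    // Same extraction as in the quoted snippet: a signed 16-bit value.
    static int excessWorkers(long ctl) {
        return (short) (ctl >>> TC_SHIFT);
    }

    // Mirrors only the "t > 2" half of the condition, without the CAS.
    static boolean wouldShrinkImmediately(long ctl) {
        return excessWorkers(ctl) > 2;
    }

    public static void main(String[] args) {
        int parallelism = 8;
        for (int total = 7; total <= 12; total++) {
            long ctl = packTotalCount(total, parallelism);
            System.out.println(total + " workers: t=" + excessWorkers(ctl)
                    + ", shrink=" + wouldShrinkImmediately(ctl));
        }
    }
}
```

Under that assumption, a pool with parallelism 8 only takes the immediate-shrink path once it has 11 or more workers.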
    

    Your program uses tasks that block on a Phaser, which makes the pool create extra workers. Each blocked task spawns a new compensating worker that is meant to pick up the next submitted task.
    However, once the pool reaches the target parallelism + 2, the compensating worker dies immediately instead of waiting, so it has no chance to pick up the task that is submitted right afterwards.
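
If the goal is simply to have all parties meet at the barrier, one way to avoid depending on compensating workers is to give every party its own thread. The following is a minimal sketch under that assumption (it is only reasonable when one thread per party is affordable). FixedPoolPhaserSketch and runParties are hypothetical names, and unlike the original program the caller arrives at the end so that the waiting tasks are released and the pool can shut down.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;

final class FixedPoolPhaserSketch {

    // Returns the number of parties seen waiting at the barrier,
    // or -1 if the calling thread was interrupted.
    static int runParties(int numParties) {
        // One thread per party: no blocked task can starve another one.
        ExecutorService pool = Executors.newFixedThreadPool(numParties);
        Phaser p = new Phaser(1); // the calling thread is the extra party
        Runnable r = () -> {
            p.register();
            p.arriveAndAwaitAdvance(); // blocks until the caller arrives
            p.arriveAndDeregister();
        };
        try {
            for (int i = 0; i < numParties; ++i) {
                pool.submit(r);
            }
            while (p.getArrivedParties() != numParties) {
                Thread.sleep(1); // poll without spinning at full speed
            }
            int arrived = p.getArrivedParties();
            p.arriveAndDeregister(); // release the waiting tasks
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
            return arrived;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(runParties(100));
    }
}
```

This trades the work-stealing pool's small thread count for predictability: the barrier no longer relies on how aggressively the ForkJoinPool creates or retires spare threads.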

    I hope this helps.