java garbage-collection java-11 zgc

Getting 'allocation stall' when enabling ZGC


I am testing ZGC, the new garbage collector included in Java 11, because it promises very low latency. Our application is a real-time service that creates and destroys many objects per second, and it does so in a multi-threaded environment using Akka.

When we enable ZGC by passing the options -XX:+UnlockExperimentalVMOptions -XX:+UseZGC and turn on GC logging, the log contains many messages like these:

[2020-05-20T18:05:36.563+0000][63.851s][info ][gc] Allocation Stall (Main-akka.remote.default-remote-dispatcher-6) 11332.231ms
[2020-05-20T18:05:36.563+0000][63.851s][info ][gc] Allocation Stall (Main-akka.remote.default-remote-dispatcher-26) 9898.046ms
[2020-05-20T18:05:36.563+0000][63.851s][info ][gc] Allocation Stall (Main-io-blocking-dispatcher-52) 12133.240ms
[2020-05-20T18:05:36.563+0000][63.851s][info ][gc] Allocation Stall (Main-akka.actor.default-dispatcher-54) 9002.299ms
[2020-05-20T18:05:36.563+0000][63.850s][info ][gc] Allocation Stall (Main-io-blocking-dispatcher-50) 12134.218ms
[2020-05-20T18:05:36.563+0000][63.850s][info ][gc] Allocation Stall (Main-akka.actor.default-dispatcher-46) 12132.540ms
[2020-05-20T18:05:36.563+0000][63.851s][info ][gc] Allocation Stall (Main-akka.actor.default-dispatcher-56) 8072.664ms

After a few seconds the JVM exits without giving any reason. We are running OpenJDK 11. Any suggestions on how to make this work?


Solution

  • An allocation stall means a thread asked for heap memory when none was available, so the requesting thread blocks until heap becomes available.

    Make sure you have enough GC threads configured. The default number of GC threads is derived from the core count, and the JDK can have trouble detecting the core count, particularly when running in Docker. See https://wiki.openjdk.java.net/display/zgc/Main#Main-SettingConcurrentGCThreads

    If your CPU utilization is low during these stalls, that is another indication that you need more GC threads.

    In general, enabling huge pages can help performance with ZGC. See https://wiki.openjdk.java.net/display/zgc/Main#Main-EnablingLargePagesOnLinux

    Also, you may just need more heap.

    EDIT TO ADD: It is probably also worthwhile to make sure you are on the latest patch release of the JDK and the OS.
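
To check the first two points (core detection and heap size), it can help to print what the JVM itself sees from inside the same container or host. The following is only a minimal sketch: the class name JvmResourceCheck and its helper methods are made up for illustration, and it uses nothing beyond the standard Runtime and java.lang.management APIs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical diagnostic class (not part of the original answer).
public class JvmResourceCheck {

    // Number of processors the JVM believes it may use.
    // Inside a container this can differ from the host's core count.
    public static int detectedProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Maximum amount of memory the JVM will attempt to use for the heap, in MiB.
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Detected processors: " + detectedProcessors());
        System.out.println("Max heap (MiB): " + maxHeapMiB());

        // List the collectors that are actually active and how often they have run.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName()
                    + ", collections so far: " + gc.getCollectionCount());
        }
    }
}
```

Run it with the same JVM options and in the same environment as the service. If the processor count is lower than expected, setting the number of concurrent GC threads explicitly with -XX:ConcGCThreads (described on the wiki page linked above) is one thing to try; if the reported max heap is small relative to the allocation rate, raising -Xmx is another.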