Tags: java-8, jvm, garbage-collection

Is the performance of Shenandoah GC in JDK8 worse than G1?


I used the image provided by Red Hat to build an application written in JDK 8.
Here is the Dockerfile:

FROM registry.access.redhat.com/ubi8/openjdk-8-runtime:latest
COPY benchmark-1.0.0.jar /benchmark.jar
EXPOSE 8080
ENV JAVA_OPTS="\
-server \
-XX:+UseShenandoahGC \
-Xms1g \
-Xmx1g \
-XX:MetaspaceSize=256m \
-XX:MaxMetaspaceSize=256m \
-verbose:gc \
-XX:+PrintGCDetails \
-XX:ConcGCThreads=2 \
-Xloggc:/home/jboss/gc.log \
-XX:+PrintGCDateStamps \
-XX:+AlwaysPreTouch"

ENTRYPOINT java ${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom -jar /benchmark.jar

And here is the Dockerfile for the G1 variant:

FROM java:openjdk-8u111-jre-alpine
COPY benchmark-1.0.0.jar /benchmark.jar
EXPOSE 8080
ENV JAVA_OPTS="\
-server \
-XX:+UseG1GC \
-Xms1g \
-Xmx1g \
-verbose:gc \
-XX:+PrintGCDetails \
-Xloggc:/gc.log \
-XX:+PrintGCDateStamps \
-XX:+AlwaysPreTouch"

Then I used JMeter to run a stress test with 200 worker threads for a duration of 10 minutes.

Here is the application code:

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class GCController {

    // Simulates constants, caches, and other long-lived objects in application services (~500 MB).
    private static final int[] CONSTANT = new int[1024 * 1024 * 125];

    @GetMapping("/allocate")
    public void allocate() {
        try {
            // Assuming each request allocates about 1 MB (256 K ints x 4 bytes).
            int[] allocate = new int[1024 * 256];
            allocate[0] = 1;
            allocate[2] = 2;
        } catch (Exception e) {
            // ignore
        }
    }
}
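
The "1 MB per request" comment can be sanity-checked with simple arithmetic; this small standalone snippet (not part of the benchmark) just computes the size of the array allocated per request:

```java
public class AllocationSize {
    public static void main(String[] args) {
        int elements = 1024 * 256;                    // array length used per request
        long bytes = (long) elements * Integer.BYTES; // each int is 4 bytes
        System.out.println(bytes + " bytes");          // 1_048_576 bytes = exactly 1 MiB
        System.out.println(bytes / (1024 * 1024) + " MiB per request");
    }
}
```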

The machine used for testing has 2 CPUs, 2 GB of memory, and 4 Mbps of bandwidth.

During the test, CPU usage approaches 100%.

The results of the experiment show that G1 performs better than Shenandoah. Is there something wrong with my experiment?

Shenandoah GC log

G1 GC log


Solution

  • Low-pause collectors like Shenandoah and ZGC aren't designed to out-perform G1 or the Parallel collector in throughput. They're designed to keep JVM pauses low (or nearly zero), which makes them better suited to latency-sensitive, real-time workloads. "Real time" doesn't mean fast; it means that overall performance is more predictable.
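
One way to see this trade-off in your own runs, without parsing GC logs, is to read the cumulative collection counts and times through the standard `GarbageCollectorMXBean` API (the collector names reported depend on which GC the JVM was started with):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class GcStats {
    public static void main(String[] args) {
        // Allocate some short-lived garbage so at least one collection is likely to run.
        for (int i = 0; i < 1000; i++) {
            byte[] garbage = new byte[1024 * 1024];
        }
        // Each registered collector exposes how many collections ran and
        // how much wall-clock time they consumed in total.
        List<GarbageCollectorMXBean> gcs = ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : gcs) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Comparing these numbers between the two containers shows where each collector spends its time: a throughput-oriented collector tends to report fewer, longer collections, while a low-pause collector trades total throughput for shorter individual pauses.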