
Java garbage collector does not clean up heap tenured gen space until it reaches 100%


We run Spring Boot microservices on Java 17 inside Kubernetes, launched with -Xms512m and -Xmx512m. Monitoring Prometheus metrics in Grafana, I see behavior I can't explain: usage of the tenured generation (old generation) grows roughly linearly from 50% to 100% in cycles of about 30 hours, then drops until about 50-60% of the space is free again, and the same cycle repeats again and again. This is based on jvm_memory_used_bytes{id="Tenured Gen"} divided by jvm_memory_max_bytes{id="Tenured Gen"}. The metric jvm_memory_usage_after_gc_percent shows the same value (it equals the result of that division). When usage reaches 100%, we get a major GC pause of 0.4-1.0 second (based on the metric jvm_gc_pause_seconds_max), after which tenured space usage drops.
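For anyone who wants to cross-check the Grafana ratio from inside the JVM, here is a minimal sketch that reads the old generation pool through the standard java.lang.management API and computes used / max. The class name OldGenUsage and the name-matching heuristic are my own assumptions; pool names differ per collector ("Tenured Gen" for Serial GC, "G1 Old Gen" for G1 GC), and as far as I know Micrometer reads its JVM memory metrics from these same beans.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of any library.
final class OldGenUsage {

    /** used / max as a fraction, or -1 when the pool reports no max. */
    static double ratio(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return (double) usedBytes / maxBytes;
    }

    /** Heuristic match on pool names such as "Tenured Gen" or "G1 Old Gen". */
    static boolean isOldGenPool(String poolName) {
        return poolName.contains("Tenured") || poolName.contains("Old Gen");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !isOldGenPool(pool.getName())) {
                continue;
            }
            MemoryUsage current = pool.getUsage();
            if (current != null) {
                System.out.printf("%s now: %.1f%%%n",
                        pool.getName(), 100 * ratio(current.getUsed(), current.getMax()));
            }
            // Usage measured right after the most recent collection of this pool;
            // null when the JVM does not support it for the pool.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                System.out.printf("%s after last GC: %.1f%%%n",
                        pool.getName(), 100 * ratio(afterGc.getUsed(), afterGc.getMax()));
            }
        }
    }
}
```

The "after last GC" figure is the more useful one for leak detection, since the "now" figure includes garbage that has not been collected yet.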

The tenured max space is 341m (2/3 of 512m). Other services behave similarly and differ only in cycle duration, e.g. 5 hours. Since GC cleans up the tenured space until about 50% of it is free, I think we don't have a memory leak and the objects in the tenured gen are eligible for collection (unreachable). Nothing looks suspicious in the young gen either: GC runs there frequently with minor pauses. I think it's a good idea to monitor these metrics and fire alerts when a threshold such as 90% is reached.
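The 341m figure follows from the generation split. A small sketch of the arithmetic, assuming the default -XX:NewRatio=2 (old:young = 2:1) and ignoring the alignment rounding the JVM applies; GenerationSizing is a hypothetical name:

```java
// Hypothetical helper illustrating the old generation size arithmetic.
final class GenerationSizing {

    /** Old gen share of the heap for a given NewRatio (old:young = newRatio:1). */
    static long oldGenBytes(long heapBytes, int newRatio) {
        return heapBytes * newRatio / (newRatio + 1);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long heap = 512 * mib;              // -Xmx512m
        long old = oldGenBytes(heap, 2);    // assumed default -XX:NewRatio=2
        System.out.println(old / mib + " MiB"); // prints "341 MiB"
    }
}
```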

Questions:

  • Does this indicate that the max heap size is too small for the application and that I should increase it?
  • In an ideally configured application, should I expect jvm_memory_usage_after_gc_percent to stay below some threshold, e.g. 90%, thanks to more frequent major GC clean-ups (with shorter major GC pauses)?
  • Is there any GC option/hint (-XX:**) that makes major clean-ups run more frequently once some threshold is reached?

Update: These services are using Serial GC, not G1 GC. The JVM selected Serial GC as the default based on ergonomics (it does so when it sees fewer than two CPUs or less than roughly 2 GB of memory, which is common under container limits). Thanks a lot to @Stephen for pointing out that this behavior is not how G1 GC works. After switching to G1 GC by explicitly specifying the JVM option -XX:+UseG1GC, the behavior is completely different and matches what I expected initially: the tenured gen never reaches 100% of its max space.
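A quick way to confirm which collector a running service actually uses is to list the GC MXBean names. The sketch below is an assumption-labeled helper (ActiveCollector and its name-to-collector mapping are mine, based on the bean names HotSpot reports, e.g. "Copy" and "MarkSweepCompact" for Serial GC, "G1 Young Generation" and "G1 Old Generation" for G1 GC); other JVMs or versions may report different names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class ActiveCollector {

    /** Best-effort guess of the collector from its GC MXBean names. */
    static String classify(List<String> gcBeanNames) {
        for (String name : gcBeanNames) {
            if (name.startsWith("G1 ")) return "G1";
            if (name.startsWith("ZGC")) return "ZGC";
            if (name.startsWith("Shenandoah")) return "Shenandoah";
            if (name.startsWith("PS ")) return "Parallel";
            if (name.equals("Copy") || name.equals("MarkSweepCompact")) return "Serial";
        }
        return "unknown";
    }

    /** Names of the GC MXBeans registered in the current JVM. */
    static List<String> gcBeanNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = gcBeanNames();
        System.out.println(names + " -> " + classify(names));
    }
}
```

The same names appear in the gc label of jvm_gc_pause_seconds, so the collector can usually be read off the existing dashboards as well.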


Solution

  • Answering my own question based on the comments and further research (thanks to everyone involved in the discussion).

    In the described case, the JVM automatically selected Serial GC, not G1 GC. G1 GC behaves quite differently in how often it cleans up the tenured space.

    Q&A

    • Does this indicate that the max heap size is too small for the application and that I should increase it?

      • No, it does not. In our case about 50% of the tenured space is free after a major GC, so we are far away from an OOM. Only if, after each major GC pause, tenured space usage stays close to 100% while the pauses become longer and more frequent does it indicate that we are close to an OOM and have a problem.
    • In an ideally configured application, should I expect jvm_memory_usage_after_gc_percent to stay below some threshold, e.g. 90%, thanks to more frequent major GC clean-ups (with shorter major GC pauses)?

      • It depends on the GC algorithm; each one has its own behavior.
    • Is there any GC option/hint (-XX:**) that makes major clean-ups run more frequently once some threshold is reached?

      • JVM GC options are specific to each GC algorithm. Serial GC offers far less fine-grained tuning than more advanced collectors such as G1 GC or ZGC, and it has no direct option for cleaning up the tenured generation more frequently once some occupancy threshold is reached. G1 GC has the option -XX:InitiatingHeapOccupancyPercent, which sets the heap occupancy, as a percentage of the entire heap, that triggers a concurrent marking cycle (default value is 45; by default G1 adapts this threshold at run time, so the value acts as a starting point).
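The warning sign from the first answer above (after-GC usage staying close to 100% across consecutive major collections) can be sketched as a simple check over recent samples. This is only an illustration: the class name, the window size and the 0.9 threshold are my assumptions, and the samples would come from something like jvm_memory_usage_after_gc_percent.

```java
// Hypothetical helper, not part of any library.
final class OldGenPressure {

    /**
     * True when the last `window` after-GC usage samples (fractions in 0..1)
     * are all at or above `threshold`, i.e. major GCs no longer free much.
     */
    static boolean sustainedHighAfterGc(double[] afterGcRatios, int window, double threshold) {
        if (window <= 0 || afterGcRatios.length < window) {
            return false; // not enough data to judge
        }
        for (int i = afterGcRatios.length - window; i < afterGcRatios.length; i++) {
            if (afterGcRatios[i] < threshold) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] healthy = {0.52, 0.48, 0.55, 0.50};  // drops to ~50% after each major GC
        double[] suspicious = {0.60, 0.92, 0.95, 0.97};
        System.out.println(sustainedHighAfterGc(healthy, 3, 0.9));    // false
        System.out.println(sustainedHighAfterGc(suspicious, 3, 0.9)); // true
    }
}
```

An alert on the after-GC value is more meaningful than one on the raw tenured usage, because with Serial GC the raw usage is expected to climb to 100% before every major collection.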

    To change this behavior, switch to G1 GC (JVM option -XX:+UseG1GC) or ZGC (-XX:+UseZGC).
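After changing the launch options, it can help to verify from inside the JVM that the flag took effect. A minimal sketch, assuming a HotSpot-based JVM where com.sun.management.HotSpotDiagnosticMXBean is available; the class name GcFlags is hypothetical. The reported origin shows whether a value was set explicitly or chosen by the JVM itself (I would expect an ergonomically selected collector to show up as such, but I have not verified this on every build).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of any library.
final class GcFlags {

    /** Current value and origin of a -XX flag, or a placeholder if unavailable. */
    static String describe(String flagName) {
        try {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (hotspot == null) {
                return flagName + " = <diagnostic bean not available>";
            }
            VMOption option = hotspot.getVMOption(flagName);
            return flagName + " = " + option.getValue() + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when this JVM build does not know the flag.
            return flagName + " = <unknown flag>";
        }
    }

    public static void main(String[] args) {
        String[] flags = {"UseSerialGC", "UseG1GC", "UseZGC", "InitiatingHeapOccupancyPercent"};
        for (String flag : flags) {
            System.out.println(describe(flag));
        }
    }
}
```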