We have Spring Boot microservices running on Java 17 inside Kubernetes, launched with the parameters -Xms512m and -Xmx512m. While monitoring Prometheus metrics in Grafana, I see the following behavior that I don't understand:
Tenured gen space (old generation) usage grows steadily and roughly linearly from 50% to 100% over about 30 hours, then drops back to around 50-60%, and the same cycle repeats again and again.
This is based on jvm_memory_used_bytes{id="Tenured Gen"} divided by jvm_memory_max_bytes{id="Tenured Gen"}. The metric jvm_memory_usage_after_gc_percent shows the same value (it equals the result of that division). On reaching 100%, there is a major GC pause lasting 0.4-1.0 seconds (based on the metric jvm_gc_pause_seconds_max), after which tenured space usage drops.
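The ratio described above can also be read from inside the JVM, which helps when cross-checking the dashboard. This is a minimal sketch using the standard java.lang.management API; the class name TenuredUsage and the pool-name heuristic are illustrative assumptions, not part of the services in question.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class TenuredUsage {

    // Used/max as a percentage; returns -1 when the JVM reports no defined maximum.
    static double percent(long used, long max) {
        return max > 0 ? 100.0 * used / max : -1.0;
    }

    // Heuristic (assumption): old-generation pools are usually named
    // "Tenured Gen" (Serial GC), "G1 Old Gen" (G1) or "PS Old Gen" (Parallel).
    static boolean isOldGen(String poolName) {
        return poolName.contains("Tenured") || poolName.contains("Old");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!isOldGen(pool.getName())) {
                continue;
            }
            MemoryUsage now = pool.getUsage();               // current usage, may be null
            MemoryUsage afterGc = pool.getCollectionUsage(); // usage after the last GC, may be null
            if (now != null) {
                System.out.printf("%s: used now %.1f%%%n",
                        pool.getName(), percent(now.getUsed(), now.getMax()));
            }
            if (afterGc != null) {
                System.out.printf("%s: used after last GC %.1f%%%n",
                        pool.getName(), percent(afterGc.getUsed(), afterGc.getMax()));
            }
        }
    }
}
```

Collectors without a separate old-generation pool (ZGC, for example) will not match the name heuristic, so the loop prints nothing for them.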
Tenured max space is 341m (2/3 of 512m). Other services behave similarly and differ only in cycle duration, e.g. 5 hours. Since GC brings tenured space back down to about 50%, I think we don't have a memory leak and the objects in tenured gen are eligible for collection (unreachable). Nothing looks suspicious in the young gen space either: GC runs there frequently with minor pauses. I think it's a good idea to monitor these metrics and fire alerts when a threshold such as 90% is reached.
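As an in-process complement to a Prometheus alert, the JVM itself can notify when usage after a collection stays above a threshold. The sketch below uses the standard collection usage threshold of MemoryPoolMXBean; it is a different mechanism from the Grafana alert mentioned above, and the class name TenuredAlert, the 0.9 fraction and the pool-name check are assumptions for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;

final class TenuredAlert {

    // Bytes corresponding to a fraction of the pool maximum (e.g. 0.9 for 90%).
    static long thresholdBytes(long maxBytes, double fraction) {
        return (long) (maxBytes * fraction);
    }

    // Sets a post-GC usage threshold on old-generation pools that support it.
    // Returns the number of pools armed (0 if the active collector exposes none).
    static int arm(double fraction) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: old-generation pools contain "Tenured" or "Old" in their name.
            boolean oldGen = name.contains("Tenured") || name.contains("Old");
            MemoryUsage usage = pool.getUsage();
            if (oldGen && usage != null && usage.getMax() > 0
                    && pool.isCollectionUsageThresholdSupported()) {
                pool.setCollectionUsageThreshold(thresholdBytes(usage.getMax(), fraction));
                armed++;
            }
        }
        return armed;
    }

    public static void main(String[] args) {
        // The platform MemoryMXBean emits a notification when a threshold is crossed.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                System.err.println("Post-GC usage above threshold: " + notification.getMessage());
            }
        }, null, null);
        System.out.println("Pools armed: " + arm(0.9));
    }
}
```

In a real service the listener would be registered once at startup. A threshold on usage after GC is less noisy than one on current usage, because with Serial GC current usage is expected to climb to 100% before each major collection.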
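Questions:
1. Does it somehow indicate that the max heap size is not enough for the application and I should increase it?
2. Should I expect that in an ideally configured application jvm_memory_usage_after_gc_percent will not reach some threshold, e.g. 90%, thanks to more frequent major GC cleanups (and with shorter major GC pauses)?
3. Is there any GC command/hint (-XX:**) to run major cleanups more frequently once some threshold is reached?
Update: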
These services are using Serial GC: the JVM selected it by default based on ergonomics, rather than G1 GC (thanks a lot to @Stephen for pointing out that this behavior is not related to G1 GC). After switching to G1 GC by explicitly specifying the JVM option -XX:+UseG1GC, the behavior is completely different and matches what I initially expected: tenured gen never reaches 100% of its max space.
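To confirm which collector ergonomics actually picked, the running JVM can be asked directly. The sketch below is a minimal example based on the standard GarbageCollectorMXBean API; the class name ActiveGc and the name-to-family table are assumptions based on names HotSpot commonly reports. As far as I know, HotSpot tends to fall back to Serial GC when it sees fewer than two CPUs or less than roughly 2 GB of memory, which is easy to hit under container limits, so the sketch prints those values as well.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveGc {

    // Maps collector bean names to a GC family.
    // The table is an assumption; bean names can differ between JVM versions and vendors.
    static String family(List<String> beanNames) {
        for (String name : beanNames) {
            if (name.startsWith("G1")) return "G1";
            if (name.startsWith("ZGC")) return "ZGC";
            if (name.startsWith("PS ")) return "Parallel";
            if (name.equals("Copy") || name.equals("MarkSweepCompact")) return "Serial";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        System.out.println("Collector beans: " + names);
        System.out.println("GC family:       " + family(names));
        // Inputs that ergonomics looks at, as seen by this JVM (container limits included).
        System.out.println("CPUs visible:    " + Runtime.getRuntime().availableProcessors());
        System.out.println("Max heap (MiB):  " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```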
Answering my own question based on the comments and further research (thanks to everyone who took part in the discussion).
In the described case, the JVM automatically selected Serial GC and not G1 GC. G1 GC behaves completely differently with respect to how often tenured space is cleaned up.
Q&A
Does it somehow indicate that the max heap size is not enough for the application and I should increase it?
Should I expect that in an ideally configured application jvm_memory_usage_after_gc_percent will not reach some threshold, e.g. 90%, thanks to more frequent major GC cleanups (and with shorter major GC pauses)?
Is there any GC command/hint (-XX:**) to run major cleanups more frequently once some threshold is reached?
-XX:InitiatingHeapOccupancyPercent, which defines the percentage of the entire heap occupancy that triggers garbage collection (default value is 45).
Alternatives to change the behavior: switch to G1 GC (JVM option -XX:+UseG1GC) or ZGC (-XX:+UseZGC).
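After changing the launch parameters, it can be worth checking that the flags really took effect inside the container. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name GcFlags and the "n/a" fallback are illustrative choices.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class GcFlags {

    // Returns the current value of a HotSpot flag, or "n/a" if this JVM does not know it.
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "n/a"; // not a HotSpot-based JVM
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return "n/a"; // flag does not exist in this JVM
        }
    }

    public static void main(String[] args) {
        // Flags mentioned in this thread; values reflect what the running JVM resolved.
        String[] names = {
            "UseSerialGC", "UseG1GC", "UseZGC", "InitiatingHeapOccupancyPercent"
        };
        for (String name : names) {
            System.out.println(name + " = " + flag(name));
        }
    }
}
```

Running it once with the default options and once with -XX:+UseG1GC shows which collector flag is enabled in each case.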