Tags: spring, spring-boot, spring-boot-actuator, micrometer, spring-micrometer

SimpleMeterRegistry clears data if data not polled every minute


I have a simple Spring Boot app with the following config (the project is available here on GitHub):

management:
  metrics:
    export:
      simple:
        mode: step
  endpoints:
    web:
      exposure:
        include: "*"

The above config creates a SimpleMeterRegistry and configures its metrics to be step-based, with a 60-second step. One script sends 50-100 requests per second to the service's dummy endpoint, and another script polls /actuator/metrics/http.server.requests every X seconds. When the polling script runs every 60 seconds, everything works as expected, but when it runs every 120 seconds, the response always contains zeros for the TOTAL_TIME and COUNT metrics.

Can anyone explain this behavior?

I have read the documentation here. The picture in the documentation could indicate that a registry will try to aggregate the data for the previous interval only if pollAsRate is called during the current interval. That would explain why it does not work for a 120-second interval. But this is just my assumption; does anyone know what is really happening here?

Spring Boot version: 2.1.7.RELEASE

UPDATE

I did a similar test with management.metrics.export.simple.step=10s. It works fine when the polling interval is 10s and does not work when it is 20s. With a 15s interval it works sporadically. So it's definitely related to the step size and the polling frequency.
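
The timing in these tests can be made concrete by computing which step window each poll lands in. The sketch below is only arithmetic, not Micrometer code: the class and method names (StepWindows, windowIndex, windowGaps) are hypothetical, and it assumes that windows are aligned to multiples of the step and that the first poll happens one polling interval after time 0.

```java
import java.util.ArrayList;
import java.util.List;

public class StepWindows {

    /** Index of the step window containing the given time (assumes windows start at multiples of the step). */
    static long windowIndex(long timeSeconds, long stepSeconds) {
        return timeSeconds / stepSeconds;
    }

    /**
     * For polls at interval, 2 * interval, ..., returns how many step windows
     * each poll advanced compared with the previous one (starting from time 0).
     */
    static List<Long> windowGaps(long stepSeconds, long pollIntervalSeconds, int polls) {
        List<Long> gaps = new ArrayList<>();
        long previous = windowIndex(0, stepSeconds);
        for (int i = 1; i <= polls; i++) {
            long current = windowIndex(i * pollIntervalSeconds, stepSeconds);
            gaps.add(current - previous);
            previous = current;
        }
        return gaps;
    }

    public static void main(String[] args) {
        // 10s step, as in the UPDATE above
        System.out.println("poll every 10s: " + windowGaps(10, 10, 6)); // [1, 1, 1, 1, 1, 1]
        System.out.println("poll every 15s: " + windowGaps(10, 15, 6)); // [1, 2, 1, 2, 1, 2]
        System.out.println("poll every 20s: " + windowGaps(10, 20, 6)); // [2, 2, 2, 2, 2, 2]
    }
}
```

A gap of 1 means the poll landed in the window right after the previously polled one; a gap of 2 means a whole window passed without any poll. Under these assumptions, the 10s poller never skips a window, the 20s poller always skips one, and the 15s poller alternates, which lines up with "works", "does not work" and "works sporadically".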


Solution

  • Finally figured out what is happening.

    On every request to /actuator/metrics, MetricsEndpoint merges the measurements (see here). It does that by collecting the values of all meters with measurement.getValue(). StepMeasurement.getValue() does not simply return the value: it also updates the current and previous intervals and counts, and rolls the count (see here and here).

    StepMeasurement.getValue

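    public double getValue() {
      // absolute (ever-growing) value reported by the underlying meter
      double absoluteCount = (Double)this.f.get();
      // growth since the last call to getValue()
      double inc = Math.max(0.0, absoluteCount - this.lastCount.sum());
      this.lastCount.add(inc);
      // the growth is credited to the current interval
      this.value.getCurrent().add(inc);
      // rolls the interval if needed and returns the previous interval's value
      return this.value.poll();
    }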
    

    StepDouble.poll

    public double poll() {
        // rolling happens here, i.e. only when somebody reads the value
        rollCount(clock.wallTime());
        return previous;
    }
    

    How is this related to the polling interval? The intervals are rolled only when a value is read, so if you do not poll the /actuator/metrics endpoint during a step, the current and previous intervals are not updated for that step. The current interval is then out of date, and the metrics end up being recorded for the "wrong" interval.
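
To see how this could produce the zeros from the question, here is a small self-contained model with a fake clock. It is a sketch, not Micrometer's implementation: StepModel, StepValue and simulate are hypothetical names, and the rollover rule (keep the accumulated value only if the previous roll happened in the immediately preceding step, otherwise report 0) is an assumption about what rollCount does, which the snippets above do not show.

```java
import java.util.Arrays;

public class StepModel {

    /** Simplified stand-in for a step value; NOT Micrometer's actual class. */
    static final class StepValue {
        private final long stepMillis;
        private double current;   // accumulated in the window currently being written
        private double previous;  // value reported for the last completed window
        private long lastWindow;  // window index seen by the most recent roll

        StepValue(long stepMillis, long nowMillis) {
            this.stepMillis = stepMillis;
            this.lastWindow = nowMillis / stepMillis;
        }

        void add(double amount) {
            current += amount;
        }

        /** Rolls the window if time has moved into a new one, then returns the previous window's value. */
        double poll(long nowMillis) {
            long window = nowMillis / stepMillis;
            if (window > lastWindow) {
                double accumulated = current;
                current = 0;
                // ASSUMPTION: the accumulated value is kept only when the last roll
                // happened in the immediately preceding window; otherwise report 0.
                previous = (lastWindow == window - 1) ? accumulated : 0.0;
                lastWindow = window;
            }
            return previous;
        }
    }

    /** Mimics getValue(): credit the growth since the last poll to the current window, then poll. */
    static double[] simulate(long stepSeconds, long pollSeconds, int polls, double requestsPerSecond) {
        StepValue value = new StepValue(stepSeconds * 1000, 0);
        double[] results = new double[polls];
        double lastCount = 0;
        for (int i = 1; i <= polls; i++) {
            long nowSeconds = i * pollSeconds;
            double absoluteCount = nowSeconds * requestsPerSecond; // steady traffic
            double inc = Math.max(0.0, absoluteCount - lastCount);
            lastCount += inc;
            value.add(inc);
            results[i - 1] = value.poll(nowSeconds * 1000);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(simulate(60, 60, 3, 50.0)));  // [3000.0, 3000.0, 3000.0]
        System.out.println(Arrays.toString(simulate(60, 120, 3, 50.0))); // [0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(simulate(10, 15, 4, 50.0)));  // [750.0, 0.0, 750.0, 0.0]
    }
}
```

In this model, polling once per step always finds the previous roll one window back and reports the count, polling every second step always finds a gap and reports 0, and a 15s poll against a 10s step alternates between the two, matching the behavior observed in the question.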