
Prometheus alert for execution time of a histogram metric with multiple values


I have set up a histogram metric to measure the execution time of certain methods.

        static final Histogram duration = Histogram.build()
            .name("controller_method_duration")
            .help("Execution time of methods")
            .labelNames("controller", "method")
            .exponentialBuckets(0.005, 4, 8)
            .register();

...
        Timer timer = duration.labels("c1","m2").startTimer();

...
        timer.observeDuration();

Now I want to add an alert that fires when a method's execution time exceeds 300 milliseconds.

How can I define this alert? Is it possible to include details in the alert so that I can tell which methods exceeded the threshold?


Solution

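  • Leaving aside the other fields that an alerting rule needs, the expression should divide the rate of the _sum series by the rate of the _count series. That ratio is the average duration per call in seconds, which is the value to compare against 0.3. The rate of _sum on its own is the total time spent per second across all calls, so it grows with traffic and is not a per-call duration. Grouping by controller and method keeps those labels on the result, so the alert shows which method exceeded the threshold:

    sum by (controller, method) (rate(controller_method_duration_sum{controller=~".*controllerName"}[1m]))
      /
    sum by (controller, method) (rate(controller_method_duration_count{controller=~".*controllerName"}[1m])) > 0.3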
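
  • To see which methods exceeded the threshold, the expression can go in a Prometheus rule file whose annotations print the labels. The following is only a minimal sketch. It assumes the value of interest is the average time per call (rate of _sum divided by rate of _count). The group name, alert name, "for" duration and severity label are hypothetical placeholders, not values from the question:

```yaml
groups:
  - name: controller-method-latency        # hypothetical group name
    rules:
      - alert: ControllerMethodSlow        # hypothetical alert name
        # Average seconds per call over the last minute, per controller/method.
        expr: |
          sum by (controller, method) (rate(controller_method_duration_sum[1m]))
            /
          sum by (controller, method) (rate(controller_method_duration_count[1m]))
            > 0.3
        for: 5m                            # assumed: must hold for 5 minutes before firing
        labels:
          severity: warning                # assumed severity label
        annotations:
          # $labels carries the labels kept by "sum by", $value is the measured average.
          summary: "Slow method {{ $labels.controller }}.{{ $labels.method }}"
          description: "Average execution time is {{ $value | humanizeDuration }} (threshold 300ms)."
```

    Because the expression groups by controller and method, each offending pair produces its own alert, and the annotations name it. Note that this alerts on the average. A few slow calls among many fast ones may not move the average above 0.3.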