prometheus spring-boot-actuator promql

PromQL for getting percentage stats for HTTP status codes


I would like to plot a graph and add it to a dashboard that displays the percentage of 2xx, 4xx and 5xx responses (and, ideally, a breakdown by specific HTTP status codes such as 401, 500, 502 and 503). I'm having trouble building a query that achieves this. Can someone please help with a query or suggestions?

I have tried the following query, but the results are not what I expect.

sum(rate(http_server_requests_seconds_count{app="xxxx", status=~"2..", uri!~".*actuator.*"}[2m])) 
/ 
sum(rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m]))
* 100

Kindly assist. Thanks in advance!


Solution

  • In Grafana I've usually seen this done with one query per status code group, but you could do something like this:

    sum by (new_code) (
      label_replace(
        rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m]),
        "new_code",
        "${1}xx",
        "status",
        "(^\\d).+"
      )
    ) / 
    scalar(sum(rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m]))) * 100
    

    First, we use label_replace() to extract the first digit of the status code into the new_code label (so 200 becomes 2xx, 404 becomes 4xx, and so on), and then we aggregate over that label with sum by().

    Then we divide by the total number of requests. Wrapping the total in scalar() turns it into a plain scalar value with no labels, so the division applies to every new_code series without any label matching.
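
    If you would rather keep this as a vector-to-vector division, one alternative uses vector matching instead of scalar(). This is only a sketch: it assumes the same metric and labels as above and has not been tested against your data.

    sum by (new_code) (
      label_replace(
        rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m]),
        "new_code",
        "${1}xx",
        "status",
        "(^\\d).+"
      )
    )
    / ignoring(new_code) group_left
    sum(rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m]))
    * 100

    Here ignoring(new_code) drops the new_code label when matching, so every group pairs with the single total series, and group_left permits that many-to-one match. The result keeps the new_code label from the left-hand side.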

    You can check this example, which points to the Prometheus demo instance. Keep in mind that the example uses the caddy_http_request_duration_seconds_count metric, because http_server_requests_seconds_count is not present there.
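
    For the per-code breakdown mentioned in the question (401, 500, 502, 503 and so on), a similar sketch skips label_replace() and aggregates by status directly. It assumes the same status label and is untested against your data.

    sum by (status) (
      rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m])
    ) /
    scalar(sum(rate(http_server_requests_seconds_count{app="xxxx", uri!~".*actuator.*"}[2m]))) * 100

    Adding a matcher such as status=~"401|500|502|503" to the first selector only (not to the total) should limit the output to those codes while keeping the percentages relative to all requests.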