prometheus promql prometheus-alertmanager

Check if any value is above a threshold in a `sum by` query


I have the following query, which returns the error rate per endpoint and method. Is there a way to create an alert in Alertmanager (preferably with the endpoint and method name in the alert body) if any value is above a certain threshold, e.g. 10%?

sum by (endpoint, method) (
    http_requests_received_total{code=~"5.."} / 
    http_requests_received_total
)

Solution

  • You can get inspiration from the following PrometheusRule array:

        - alert: "APIErrorRateIsHigh"
          annotations:
            summary: "Error rate is high"
            description: "Error rate is higher than 10% on {{ $labels.method }} - {{ $labels.endpoint }}\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
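          expr: |-
            # Aggregate numerator and denominator separately before dividing;
            # otherwise the division only matches series with the same `code` label.
            (
              sum by (endpoint, method) (
                rate(http_requests_received_total{code=~"5.."}[5m])
              )
                /
              sum by (endpoint, method) (
                rate(http_requests_received_total[5m])
              )
            ) > 0.1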
          for: 0m
          labels:
            severity: high
    

    I would advise testing the query in Prometheus first, with the threshold set to something easier to trigger, like 0.0001. Once it returns the series you expect, write the alert rule.
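
    For a quick check in the Prometheus expression browser, a sketch along these lines may help. It assumes the metric and label names from the question. Numerator and denominator are aggregated separately, so the `code` label does not restrict which series are matched in the division. The 0.0001 threshold is only a test value.

    ```promql
    # Error ratio per endpoint and method over the last 5 minutes (no threshold).
    # Returns one series per (endpoint, method) pair that has 5xx traffic.
    sum by (endpoint, method) (rate(http_requests_received_total{code=~"5.."}[5m]))
      /
    sum by (endpoint, method) (rate(http_requests_received_total[5m]))

    # Same ratio with a deliberately low test threshold.
    # Only pairs whose ratio exceeds the threshold are returned.
    (
      sum by (endpoint, method) (rate(http_requests_received_total{code=~"5.."}[5m]))
        /
      sum by (endpoint, method) (rate(http_requests_received_total[5m]))
    ) > 0.0001
    ```

    Run each expression on its own. Every series the second expression returns would become one alert, with its `endpoint` and `method` labels available to the annotation templates. Once the results look right, raise the threshold back to 0.1.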