Tags: prometheus, prometheus-alertmanager, prometheus-blackbox-exporter

AlertManager downtime alert unless 429 (Too Many Requests) HTTP status code


Currently I have an AlertManager config that simply sends an alert when the "probe_success" metric is 0.

I don't know how to join the "probe_http_status_code" metric with the "probe_success" metric in the "expr" field of an alert rule, so that the alert does not fire when "probe_success" is 0 because of a 429 (Too Many Requests) HTTP status code.

I tried to work this out from the similar question "How can I 'join' two metrics in a Prometheus query?", but had no luck.

"probe_success" and "probe_http_status_code" are both Blackbox Exporter metrics.


Solution

  • What you probably want here is the Blackbox Exporter's valid_status_codes option for the HTTP probe. List 429 alongside whichever 2xx codes you expect, and probe_success will stay at 1 when any of those codes is returned, so the existing alert rule needs no join.
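
As a rough sketch of that suggestion, a Blackbox Exporter module could look something like the following. The module name http_2xx_or_429 and the timeout are assumptions made for illustration, and the list of accepted codes should be adjusted to whatever the probed endpoint legitimately returns.

```yaml
# blackbox.yml (sketch; the module name and timeout are illustrative assumptions)
modules:
  http_2xx_or_429:
    prober: http
    timeout: 5s
    http:
      # When this option is omitted, any 2xx response counts as success.
      # Setting it replaces that default, so the expected 2xx codes
      # have to be listed explicitly next to 429.
      valid_status_codes: [200, 201, 204, 429]
```

The scrape job in Prometheus would then need to request this module (via the module parameter in the job's params) for the change to take effect. Note that with this setup a 429 is indistinguishable from a 200 as far as probe_success is concerned, though probe_http_status_code still records the actual code.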