I'm using the official stable/prometheus-operator chart to deploy Prometheus with Helm.
It's working well so far, except for the annoying CPUThrottlingHigh alert, which is firing for many pods (including Prometheus's own config-reloader containers). This alert is currently under discussion, and I want to silence its notifications for now.
The Alertmanager has a silence feature, but it is web-based:
Silences are a straightforward way to simply mute alerts for a given time. Silences are configured in the web interface of the Alertmanager.
Is there a way to mute notifications from CPUThrottlingHigh using a config file?
Well, I managed to get it working by configuring a hackish inhibit_rule:
inhibit_rules:
- target_match:
    alertname: 'CPUThrottlingHigh'
  source_match:
    alertname: 'DeadMansSwitch'
  equal: ['prometheus']
The DeadMansSwitch is, by design, an "always firing" alert shipped with prometheus-operator, and the prometheus label is common to all alerts, so the CPUThrottlingHigh alert ends up inhibited forever. It stinks, but it works.
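For reference, here is a minimal sketch of where this rule might sit in a Helm values file. It assumes the chart exposes the Alertmanager configuration under the alertmanager.config key; the file name values.yaml is only an example, and the rest of the Alertmanager config (route, receivers and so on) is left out.

```yaml
# values.yaml (hypothetical file name), passed to helm with -f
alertmanager:
  config:
    # route, receivers, etc. are omitted from this sketch
    inhibit_rules:
    - target_match:
        alertname: 'CPUThrottlingHigh'   # the alert to mute
      source_match:
        alertname: 'DeadMansSwitch'      # always-firing alert used as the inhibitor
      equal: ['prometheus']              # label that must match on both alerts
```

The equal list is what ties the two alerts together: the rule only applies when source and target carry the same value for the prometheus label.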
Pros:
- It can be done through a config file (the alertmanager.config helm parameter).
- The CPUThrottlingHigh alert is still present on Prometheus for analysis.
- The CPUThrottlingHigh alert only shows up in the Alertmanager UI if the "Inhibited" box is checked.
Cons:
- Any change to DeadMansSwitch or to the prometheus label design will break this (which only means the alerts start firing again).
Update: My Cons became real...
The DeadMansSwitch alertname just changed in stable/prometheus-operator 4.0.0. If you are using this version (or above), the new alertname is Watchdog.
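With chart version 4.0.0 or above, the same rule would presumably only need the source alert renamed. A sketch, assuming nothing else about the alert or the prometheus label changed:

```yaml
inhibit_rules:
- target_match:
    alertname: 'CPUThrottlingHigh'
  source_match:
    alertname: 'Watchdog'   # was 'DeadMansSwitch' before chart 4.0.0
  equal: ['prometheus']
```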