I would like to create a Prometheus alert that sends a firing alert every minute, then resolves itself and sends a resolved alert. What I am seeing instead is that the alert stays firing and never becomes resolved.
This is the rule file:
groups:
  - name: example
    rules:
      - alert: 'flipping rule'
        expr: minute() % 2
        for: 30s
This is the Prometheus configuration file:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.8.158:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "prom-rule.yaml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
    relabel_configs:
      - source_labels: [branch]
        regex: HEAD
        action: drop
  - job_name: "nginx-exporter"
    static_configs:
      - targets: ["192.168.8.158:9113"]
  - job_name: "node-exporter"
    static_configs:
      - targets: ["localhost:9100"]
    metric_relabel_configs:
      - regex: 'node_arp_entries'
        source_labels: [__name__]
        action: keep
      - regex: 'node_boot_time_seconds'
        source_labels: [__name__]
        action: keep
  - job_name: "cadvior"
    static_configs:
      - targets: ["localhost:9999"]
These photos show that the alert just stays active instead of flipping between firing and resolved every minute, as I would expect.
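An alerting rule is active whenever its expression returns at least one time series, regardless of the sample value. The expression minute() % 2 always returns a sample (either 0 or 1), so the alert never becomes inactive. Adding an explicit comparison to the rule's expression should solve the issue, because a comparison drops the samples that do not match and the result is empty every other minute, like this: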
expr: minute() % 2 == 0
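For context, here is a sketch of what the complete rule file might look like with that change applied. It assumes everything else in the rule from the question stays the same; the comments describe the expected behavior and are not taken from the question.

```yaml
groups:
  - name: example
    rules:
      - alert: 'flipping rule'
        # minute() is the minute of the hour in UTC (0-59).
        # The "== 0" comparison acts as a filter: the expression returns a
        # sample only during even minutes and returns nothing during odd ones.
        expr: minute() % 2 == 0
        # The alert is pending for the first 30s of each even minute,
        # then firing until the minute turns odd and the result is empty.
        for: 30s
```

Using == 1 instead should work the same way, with the alert active during odd minutes. Note that the resolved notification may not arrive the instant the alert stops firing, since Alertmanager grouping settings such as group_interval also affect when notifications are sent.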