Since I'm having a horrid time configuring the alerting rules for the Prometheus Alertmanager, maybe someone can give me a hint in the right direction.
Here are the rules I'm currently trying to implement (taken straight from https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/):
rules.yml:
    groups:
    - name: example
      rules:
      # Alert for any instance that is unreachable for >5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
      # Alert for any instance that has a median request latency >1s.
      - alert: APIHighRequestLatency
        expr: api_http_request_latencies_second{quantile="0.5"} > 1
        for: 10m
        annotations:
          summary: "High request latency on {{ $labels.instance }}"
          description: "{{ $labels.instance }} has a median request latency above 1s (current value: {{ $value }}s)"
With the amtool and promtool config checks I'm getting the following error:
Checking '/etc/prometheus/rules.yml' FAILED: yaml: unmarshal errors:
line 1: field groups not found in type config.plain
amtool: error: failed to validate 1 file(s)
My first guess would be wrong indentation or some other kind of YAML syntax error. However, I've tried multiple alerting rules, different files, and different editors (currently I'm using nano). The YAML has also been checked with multiple YAML linters. So far I've always gotten errors along the lines of the one shown.
Any help or suggestion would be greatly appreciated!
prometheus, version 2.22.2 (branch: HEAD, revision: de1c1243f4dd66fbac3e8213e9a7bd8dbc9f38b2)
go version: go1.15.5
platform: linux/amd64
alertmanager, version 0.21.0 (branch: HEAD, revision: 4c6c03ebfe21009c546e4d1e9b92c371d67c021d)
go version: go1.14.4
yaml linters:
https://codebeautify.org/yaml-validator
https://onlineyamltools.com/validate-yaml
tested alerting rules:
https://onlineyamltools.com/validate-yaml
https://rakeshjain-devops.medium.com/prometheus-alerting-most-common-alert-rules-e9e219d4e949
https://github.com/vegasbrianc/prometheus/blob/master/prometheus/alert.rules
The unmarshalling of groups fails because it is supposed to be a list:
    groups:
    - name: GroupName
      rules:
      - alert: ...
See the documentation on recording rules, which use the same file format as alerting rules.
UPDATE after post was corrected
Your file seems to be correct. The command line is:
promtool check rules /etc/prometheus/rules.yml
I expect you used the command to check the config and not the rules.
Please note that amtool validates Alertmanager's config, not Prometheus's.
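To make the distinction concrete, here is a minimal sketch of a shell helper that suggests a validator based on a file's top-level YAML key. The function name suggest_checker is hypothetical, and matching on key names is only a heuristic (it assumes rule files start with groups: and Alertmanager configs contain a top-level route:); it is not something either tool does itself.

```shell
#!/bin/sh
# Hypothetical helper: suggest a validator from the file's top-level YAML keys.
# This is a heuristic on key names, not an official detection mechanism.
suggest_checker() {
  file=$1
  if grep -q '^groups:' "$file"; then
    # Rule files (alerting and recording) start with a top-level "groups:" list.
    printf 'promtool check rules %s\n' "$file"
  elif grep -q '^route:' "$file"; then
    # Assumption: an Alertmanager config defines a top-level "route:".
    printf 'amtool check-config %s\n' "$file"
  else
    # Otherwise assume the main Prometheus config.
    printf 'promtool check config %s\n' "$file"
  fi
}
```

Calling suggest_checker on a rules file like the one in the question would print the promtool check rules command rather than an amtool one.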