I am using process-exporter to monitor processes, and I want to alert when a process uses too much CPU.
This is the query I use to monitor CPU in my Prometheus dashboard:
sum(rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance="$host", mode=~"system|user"}[20s])) by (groupname, instance)
I tried to write an alert like this (testing with 10% CPU first):
- name: process
  rules:
  - alert: CPUProcess
    expr: sum(rate(namedprocess_namegroup_cpu_seconds_total[20s])) by (groupname, instance) > 10
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "(instance {{ $labels.instance }}) use too much CPU"
      description: "Process (instance {{ $labels.groupname }}) use high CPU"
But it doesn't seem to work (other alerts work normally). Can you give me some advice? Thank you.
Fixed by changing the expression to namedprocess_namegroup_cpu_seconds_total{groupname=~".+", mode=~"system"} > 10
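One caveat about that expression: namedprocess_namegroup_cpu_seconds_total is a cumulative counter of CPU seconds, so comparing it directly to 10 fires once a group has used more than 10 seconds of system CPU time in total, not when it is using 10% CPU. Also, rate() over this counter returns CPU-seconds per second (that is, cores), so "> 10" in the original rule means more than 10 full cores, which may be why it never fired. A minimal sketch of a percentage-based rule follows. The 1m rate window, the "* 100" conversion to percent of one core, and the annotation wording are my assumptions, not something from the original setup; the window should cover at least two scrapes.

```yaml
groups:
  - name: process
    rules:
      - alert: CPUProcess
        # rate() gives CPU-seconds per second (cores); * 100 converts to percent of one core.
        # The [1m] window is an assumption; it must span at least two scrape intervals.
        expr: sum by (groupname, instance) (rate(namedprocess_namegroup_cpu_seconds_total{mode=~"system|user"}[1m])) * 100 > 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Process {{ $labels.groupname }} on {{ $labels.instance }} uses too much CPU"
          description: "CPU usage is {{ $value }}% of one core"
```

On a multi-core host a process group can exceed 100% with this expression, so the threshold is relative to a single core rather than to the whole machine.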