Tags: prometheus, prometheus-alertmanager, prometheus-node-exporter

Prometheus Alertmanager - high CPU alert not firing


I configured Prometheus Alertmanager, but it does not alert when the CPU usage of one of my servers reaches 99%. This is the alert rule:

- alert: HostHighCpuLoad
  expr: avg(irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "High usage on {{ $labels.instance }}"
    description: "{{ $labels.instance }} has an average CPU idle of {{ $value }}%"

It looks like my expression takes the global average across all my servers, but I need to monitor this metric for each server individually.

Has anyone run into this problem before?


Solution

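  • Yes. Without a grouping clause, avg() aggregates every series into a single value, so your expression averages the idle rate over all CPUs of all instances, and one busy server is masked by the idle ones. Group the average by instance instead: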

    avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
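
    For reference, here is a minimal sketch of the complete rule using this expression, assuming the usual Prometheus rule file layout. The group name `host-cpu` is hypothetical, and the description annotation is reworded to show the value as a percentage; the alert name, threshold, and labels are taken from the question.

    ```yaml
    groups:
      - name: host-cpu  # hypothetical group name, pick your own
        rules:
          - alert: HostHighCpuLoad
            # Average idle percentage over all CPUs of each instance separately
            expr: avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
            for: 1m
            labels:
              severity: warning
            annotations:
              summary: "High usage on {{ $labels.instance }}"
              # $value is the idle percentage, not a duration in seconds
              description: "{{ $labels.instance }} has an average CPU idle of {{ $value }}%"
    ```

    Because `by (instance)` keeps the `instance` label on the result, `{{ $labels.instance }}` in the annotations resolves to the affected host, and the alert fires per server whose idle percentage stays below 30 (usage above 70%) for one minute. The file can be validated with `promtool check rules <file>` before reloading Prometheus.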