Tags: prometheus, grafana, prometheus-node-exporter, thanos

Find exact CPU percentage from the metrics exported by prometheus-node-exporter


I use the node_cpu_seconds_total metric for this.

Basically, I want to subtract the mode="idle" time from the total CPU time, take the average rate of the result, and convert that to a percentage.

I tried something like:

100 - (avg(rate(node_cpu_seconds_total{instance="ip-X-X-X-X.eu-west-1.compute.internal:9100",job="rabbitmq-prod-node-exporter",replica="prometheus-prod"} - node_cpu_seconds_total{instance="ip-X-X-X-X.eu-west-1.compute.internal:9100",mode="idle",job="rabbitmq-prod-node-exporter",replica="prometheus-aws-prod"}))[1m] * 100)

But this does not seem correct, and it also produces a parse error:

Error executing query: parse error at char 177: range specification must be preceded by a metric selector, but follows a *promql.AggregateExpr instead

Solution

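  • The parse error occurs because a range selector such as [1m] must directly follow a metric selector inside rate(); it cannot be applied to the result of a subtraction or an aggregation. Apply rate() to each selector separately, then divide the non-idle rate by the total rate:

    100 * (sum(rate(node_cpu_seconds_total{instance="INSTANCE",job="JOB",replica="REPLICA"}[1m])) - sum(rate(node_cpu_seconds_total{instance="INSTANCE",mode="idle",job="JOB",replica="REPLICA"}[1m]))) / sum(rate(node_cpu_seconds_total{instance="INSTANCE",job="JOB",replica="REPLICA"}[1m]))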
    

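    A simpler approach is to measure the idle share directly and subtract it from 100. "irate" uses only the last two samples in the range, so it reacts faster to short spikes than "rate"; for alerting or smoother graphs, "rate" is usually the better choice: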

    100 - 100 * (avg(irate(node_cpu_seconds_total{instance="INSTANCE",job="JOB",replica="REPLICA",mode="idle"}[1m])))
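To see why subtracting the average idle rate from 100 gives the busy percentage, here is a minimal Python sketch of the arithmetic. The counter readings, the two-core host, and the function names are hypothetical, and the per-series rate is a plain difference over elapsed time; Prometheus's own rate() and irate() also handle counter resets and window boundaries, which this sketch ignores.

```python
# Hypothetical node_cpu_seconds_total readings for a 2-core host,
# taken 60 seconds apart. Keys are (cpu, mode); values are seconds.
BEFORE = {("0", "idle"): 1000.0, ("0", "user"): 200.0,
          ("1", "idle"): 1100.0, ("1", "user"): 100.0}
AFTER = {("0", "idle"): 1045.0, ("0", "user"): 215.0,
         ("1", "idle"): 1130.0, ("1", "user"): 130.0}


def per_series_rate(before, after, elapsed_seconds):
    """Simplified per-series rate: counter increase divided by elapsed time."""
    return {key: (after[key] - before[key]) / elapsed_seconds for key in before}


def busy_percent_from_idle(rates):
    """Mirrors 100 - 100 * avg(rate(...{mode="idle"}))."""
    idle = [value for (cpu, mode), value in rates.items() if mode == "idle"]
    return 100 - 100 * (sum(idle) / len(idle))


def busy_percent_from_totals(rates):
    """Mirrors 100 * (sum(all) - sum(idle)) / sum(all)."""
    total = sum(rates.values())
    idle = sum(value for (cpu, mode), value in rates.items() if mode == "idle")
    return 100 * (total - idle) / total


rates = per_series_rate(BEFORE, AFTER, 60.0)
busy_from_idle = busy_percent_from_idle(rates)
busy_from_totals = busy_percent_from_totals(rates)
print(busy_from_idle, busy_from_totals)  # 37.5 37.5
```

Both forms agree here because the per-mode rates of each core sum to one second of CPU time per second. On real data they can differ slightly, since scrapes are not perfectly aligned.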