I'm using a PromQL query to calculate the cumulative traffic sent or received through some interfaces on any node over the last 60 minutes, using Prometheus Node Exporter's metrics:
delta(node_network_receive_bytes_total{device=~"ens.*"}[60m])*8
This works fine as long as the node does not reboot in that interval: the value is simply the difference between the last and the first sample in the range. When the system reboots and the counter resets, the function no longer returns that total.
For example, given this graph for node_network_transmit_bytes_total:
... the function will return -9MiB, instead of 10.2MiB.
I guess I could play around with rate() to get an estimate that also takes the time into account. But is there a better function or way to get the actual value?
As indicated in the documentation of delta():
delta should only be used with gauges.
You should be using the increase() function which is specific to counters.
Breaks in monotonicity (such as counter resets due to target restarts) are automatically adjusted for.
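As a sketch, the query from the question could be rewritten as below, assuming the same metric name and device matcher. Note that increase() extrapolates to cover the full range, so the result is an estimate and may not be an exact integer even though the counter only moves in whole bytes.

```promql
# Bits received per interface over the last 60 minutes; counter resets are adjusted for.
increase(node_network_receive_bytes_total{device=~"ens.*"}[60m]) * 8

# Separate, equivalent query using rate(): per-second rate times the window length (3600 s).
rate(node_network_receive_bytes_total{device=~"ens.*"}[60m]) * 3600 * 8
```

To check whether a reboot actually fell inside the window, resets(node_network_receive_bytes_total{device=~"ens.*"}[60m]) returns the number of counter resets seen in that range.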
This is one of the main reasons to distinguish between gauges and counters. See this answer about the difference.
You can identify counters by one of the following methods:
- the TYPE line in the exporter's metrics output (# TYPE http_requests_total counter)
- the metric name ends in _total (if exporter respects best practices)