prometheus, monitoring, grafana, promql

PromQL min_over_time + predict_linear / peaks monitoring


Context

The application running on the monitored disk intermittently doubles the amount of space it requires, causing disk utilization peaks that only last for a few seconds / minutes.

If the storage space is not enough to absorb those peaks, the application's storage gets corrupted.

The monitoring examples I found only account for the usual amount of storage used, and treat the peaks as temporary glitches.


What I tried (min_over_time + predict_linear : KO :-( )

I tried to use min_over_time(...[12h]) along with predict_linear, but it's not working due to a type error :

I currently have the following PromQL : predict_linear(node_filesystem_avail_bytes{job="nodeexporter",fstype!=""}[6h], 4*60*60)

but what I actually want to monitor is min_over_time(node_filesystem_avail_bytes{job="nodeexporter",fstype!=""}[12h]) instead of node_filesystem_avail_bytes{job="nodeexporter",fstype!=""}, to account for sporadic disk utilisation peaks. I thought it would make sense, but unfortunately the following is not an accepted PromQL query:

predict_linear(min_over_time(node_filesystem_avail_bytes{job="nodeexporter",fstype!=""}[12h])[6h], 4*60*60)

Error :

Error executing query: invalid parameter "query": 1:94: parse error: ranges only allowed for vector selectors

Questions

  • How would you monitor the disk utilization to take those peaks into account ?

  • How would you use min_over_time + predict_linear together ?

Thanks in advance for any clue or suggestion !


Solution

  • Just add :5m after 6h in the second square brackets, so that [6h] becomes [6h:5m]. This enables subquery mode, which instructs Prometheus to evaluate the inner query (e.g. min_over_time(...)) over the given time range at every 5-minute step, and then to execute the outer query (e.g. predict_linear(...)) on top of the results returned by the inner query.
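
Applied to the query from the question, the change would look roughly like this. It is a sketch that reuses the question's own selector and 4-hour horizon; the 5m step is just one reasonable choice and has not been tuned for any particular scrape interval.

```promql
# Inner query: rolling 12h minimum of available bytes, i.e. the worst peak seen.
# Subquery [6h:5m]: re-evaluate that minimum every 5 minutes over the last 6 hours.
# Outer query: fit a line through those points and extrapolate 4 hours ahead.
predict_linear(
  min_over_time(node_filesystem_avail_bytes{job="nodeexporter",fstype!=""}[12h])[6h:5m],
  4*60*60
)
```

The original query failed because a plain range such as [6h] can only follow a vector selector, while min_over_time(...) returns an instant vector; the range:step form is what turns it back into a range vector that predict_linear accepts. The step may also be left empty ([6h:]), in which case Prometheus falls back to its default evaluation interval.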