My use case: I have a bunch of devices, and I want to create only one alert that notifies me if a device suddenly goes offline. I can use absent() if I create one alert per device, but I want a single alert that tells me which device went offline.
In Prometheus you can only work with metrics that are there, not with metrics that are not there. You can time-shift your queries, though, and work with that.
You could try the following:
(
  someMetric{}
  or
  (someMetric{} offset 3m - 100000)
) < -50000
The expression in parentheses returns the current value for every series that is present now. For every series that was present 3 minutes ago but is missing now, it returns the old value minus 100000. (The offset modifier has to follow the selector directly, which is why it sits inside the inner parentheses.) Subtracting an arbitrary large value lets you distinguish the two cases: a value in the 'normal' range means the metric is present, while a very small (large negative) number means the metric disappeared. Based on that, you can create an alert on the expression returning a very small value. The result keeps the labels of the original series, so a single alert tells you which device went offline. Adapt the two constants so that they lie well outside the normal value range of your metric.
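As a rough sketch, the expression could be wrapped in a single alerting rule like the one below. It uses the `or` form with the offset attached directly to the selector, and it assumes normal values of someMetric lie between -50000 and 50000. The group name, alert name, severity label, and the `instance` label used to identify the device are all assumptions for illustration; substitute whatever labels your devices actually carry.

```yaml
groups:
  - name: device-availability            # hypothetical group name
    rules:
      - alert: DeviceMetricDisappeared   # hypothetical alert name
        # Returns a series only when it existed 3m ago but is missing now:
        # the shifted copy (old value - 100000) is the only part below -50000.
        expr: |
          (
            someMetric{}
            or
            (someMetric{} offset 3m - 100000)
          ) < -50000
        # No "for:" clause: the expression itself only returns a value
        # for up to 3 minutes, so a pending period would shorten the alert further.
        labels:
          severity: warning              # assumed label, adjust as needed
        annotations:
          # "instance" is an assumed device label.
          summary: "someMetric disappeared for {{ $labels.instance }}"
```

Because the arithmetic drops only the metric name and keeps all other labels, one rule yields one alert instance per missing device.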
NOTE: this alert will only last for 3 minutes (in this example), as the expression no longer returns a value after 3 minutes. If you increase the offset to, say, 1h, the alert will last longer. However, for a metric that was present for less than the offset, the disappearance is detected only after some delay, and the alert lasts only as long as the metric had been present.
case 1: the metric was present for longer than the offset
metric            ---------------------
offset metric                 ---------------------
alert                                  xxxxxxxxxxxx
case 2: the metric was present only for a short time
metric            -----
offset metric                 -----
alert                         xxxxx