I am sending metrics for two disks to Prometheus, and I want to alert if one of the disks stops sending metrics. Say I have diskA and diskB and I am collecting a disk_up metric. Now diskB fails. In Prometheus,
disk_up{disk="diskA"}
will have data and
disk_up{disk="diskB"}
will be missing
absent(disk_up)
will return nothing (an empty result), since disk_up still has diskA's data.
absent(disk_up{disk="diskB"})
would serve the purpose, but I don't want to hardcode the disk names.
What is a better way to set up an alert for this scenario?
You could use something like this:
max_over_time(disk_up[1h])
unless
disk_up
That is, it returns every series that existed at some point during the past hour but doesn't exist now.
You will get a false positive if a disk_up metric pops up for some diskC, though, or if the metric gains or loses a label due to the exporter or your Prometheus configuration.
You can avoid the former by explicitly filtering for the disks/instances/whatever you are interested in, but that would defeat your goal of not hardcoding them. It is probably the wiser thing to do though:
max_over_time(disk_up{disk=~"disk(A|B)"}[1h])
unless
disk_up{disk=~"disk(A|B)"}
Or at least
max_over_time(disk_up{job="my_disk_job"}[1h])
unless
disk_up{job="my_disk_job"}
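To turn this into an actual alert, the expression can go into a rule file. The following is only a sketch: the group name, alert name, `for` duration, severity label and annotation text are all assumptions to adapt, and `my_disk_job` is just the placeholder job name from the query above.

```yaml
groups:
  - name: disk_alerts                 # hypothetical group name
    rules:
      - alert: DiskMetricMissing      # hypothetical alert name
        # Series seen in the last hour that have no current sample.
        expr: |
          max_over_time(disk_up{job="my_disk_job"}[1h])
          unless
          disk_up{job="my_disk_job"}
        for: 5m                       # assumed grace period to ride out a missed scrape
        labels:
          severity: warning           # assumed label; use whatever your routing expects
        annotations:
          summary: 'disk_up for disk {{ $labels.disk }} has stopped reporting'
```

Keep in mind that the alert resolves by itself once the series has been gone for the whole 1h window, because the left-hand side then no longer returns it. Widen the range if the alert should stay active longer.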