Tags: prometheus, promql, prometheus-node-exporter

Alert if data in certain labels is missing in Prometheus


I am sending metrics for two disks to Prometheus, and I want to alert if one of the disks stops reporting. Say I have diskA and diskB and collect a disk_up metric for each. Now diskB fails. In Prometheus,

disk_up{disk="diskA"} will have data and disk_up{disk="diskB"} will be missing.

absent(disk_up) returns nothing, since disk_up still has diskA's data. absent(disk_up{disk="diskB"}) would serve the purpose, but I don't want to hardcode the disk names.

What is the best way to set up an alert for this scenario?


Solution

  • You could use something like this:

    max_over_time(disk_up[1h])
      unless
    disk_up
    

    I.e. the metric existed at any time during the past 1 hour but doesn't exist now.

    You will get a false positive if a disk_up metric briefly pops up for some diskC, though, or if the metric gains or loses a label due to the exporter or your Prometheus configuration.
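
    For the label-change case, one possible mitigation (a sketch, not something the query above does by itself) is to compare the two sides only on the labels that identify a disk. This assumes that the instance and disk labels together identify a disk in your setup; adjust the list to your own labels:

    ```promql
    # Series seen in the last hour, minus those whose (instance, disk)
    # pair still reports now. All other labels are ignored when matching,
    # so a gained or lost label does not make the old series look missing.
    max_over_time(disk_up[1h])
      unless on (instance, disk)
    disk_up
    ```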

    You can avoid the diskC case by explicitly filtering for the disks, instances, or whatever else you are interested in, but that would defeat your goal of not hardcoding them. It is probably the wiser thing to do, though:

    max_over_time(disk_up{disk=~"disk(A|B)"}[1h])
      unless
    disk_up{disk=~"disk(A|B)"}
    

    Or at least filter by job:

    max_over_time(disk_up{job="my_disk_job"}[1h])
      unless
    disk_up{job="my_disk_job"}
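
    To turn this into an actual alert, the expression can go into a Prometheus alerting rule. The following is a minimal sketch: the group name, alert name, `for` duration, severity label, and summary text are all placeholders chosen for illustration, and the job name is the one from the example above.

    ```yaml
    groups:
      - name: disk-alerts                 # hypothetical group name
        rules:
          - alert: DiskMetricMissing      # hypothetical alert name
            # Series that reported within the last hour but have no sample now.
            expr: |
              max_over_time(disk_up{job="my_disk_job"}[1h])
                unless
              disk_up{job="my_disk_job"}
            for: 5m                       # assumed grace period before firing
            labels:
              severity: warning           # assumed routing label
            annotations:
              summary: "disk_up for {{ $labels.disk }} stopped reporting"
    ```

    Note that an alert built this way resolves by itself once the last sample is older than the 1h window, because the series then no longer appears on the left-hand side. Widen the range if you need the alert to stay active longer.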