prometheus, prometheus-alertmanager

Multiple scrape jobs running on same target in Prometheus


I'm working on a use case where I need to scrape metrics at different intervals. For example, metric_one needs to be scraped every 1 hour, while metric_two is scraped every 15s. Both metrics come from the same target. The solution I have tried is the following:

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'job_two'
    static_configs:
      - targets: ["engine:5001"]
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'metric_one'
        action: drop
  - job_name: 'job_one'
    scrape_interval: 3600s
    static_configs:
      - targets: ["engine:5001"]
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'metric_two'
        action: drop

This Prometheus config tries to run two different scrape jobs against the same host: job_one only keeps metric_one and job_two only keeps metric_two. However, Prometheus service discovery does not work properly for job_two: its health-check status is shown as unknown, and sometimes the target does not appear at all when I run up in the Prometheus console. Is there a proper solution to this, or am I doing something wrong here?
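
For reference, the health of each scrape job can be checked from the expression browser with the built-in up series; a small sketch, assuming the job names and target from the config above:

    # up is 1 while the last scrape of the target succeeded and 0 when it failed;
    # the series is absent entirely if the job/target pair is not being scraped at all.
    up{job="job_one", instance="engine:5001"}
    up{job="job_two", instance="engine:5001"}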


Solution

  • The easiest solution is to decrease the scrape interval to 5 minutes or less.

    This comes down to Prometheus internals: PromQL only looks back 5 minutes to find samples, so once more than 5 minutes have passed since the last scrape you start seeing the behavior you describe. It is easy to see on a graph: the metric line breaks shortly after the last scrape. To prevent this, scrape the metric at least every five minutes, even if its value does not change that often; this simply tells Prometheus that the metric is still there (see the sketch below).
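
    A minimal sketch of the adjusted job, assuming the target and metric names from the question; the only change is lowering job_one's interval from 3600s to 5m so its samples always stay within PromQL's 5-minute lookback window:

        - job_name: 'job_one'
          scrape_interval: 5m          # was 3600s; 5m keeps samples within PromQL's lookback
          static_configs:
            - targets: ["engine:5001"]
          metric_relabel_configs:
            - source_labels: [__name__]
              regex: 'metric_two'
              action: drop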