docker prometheus grafana docker-swarm prometheus-node-exporter

How to configure prometheus.yml to scrape only running containers for node-exporter


I have a problem with Grafana/Prometheus when using node-exporter to collect host resources from Docker Swarm nodes.

I tested with only one swarm node. When I use the query
label_values(node_uname_info{job="node-exporter"}, instance) in a Grafana variable, the result contains the old IPs of stopped containers as well as the IP of the running container. I want it to return only the IP of the running container. As the image below shows, the IPs of all node-exporter containers are listed all the time.

enter image description here

But actually, only one container is running, with the IP 10.0.1.12:9100. The other IPs are the old IPs of node-exporter containers that were started and then stopped. Here is the time series showing when these containers were created.

enter image description here

I think we can configure the scrape job in prometheus.yml with relabel_configs, but I am not familiar with it. Here is the scrape config I got from https://github.com/stefanprodan/swarmprom.

  - job_name: 'node-exporter'
    dns_sd_configs:
    - names:
      - 'tasks.node-exporter'
      type: 'A'
      port: 9100

Do you know how to keep only the running containers by adding some setting in prometheus.yml? Thank you so much for your consideration.


Solution

  • Based on the last comment, you can modify the queries using the following pattern:

    min without (instance) (<query without instance>)
    

    so the (example) query

    rate(cpu_time_seconds{instance="$instance", otherLabel="otherValue"}[5m])
    

    becomes

    min without (instance) (rate(cpu_time_seconds{otherLabel="otherValue"}[5m]))
    

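    The choice of aggregation function matters little here, because only one node-exporter instance reports a value at any given time. This relies on the single-node setup from the question; with several nodes, the aggregation would merge their series into one.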

    Additionally, you can remove the instance variable from the dashboard.
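
    As a rough illustration of the same pattern with a metric that node-exporter actually exposes, a CPU panel query might be rewritten as sketched below. The job="node-exporter" label comes from the scrape config in the question; the mode="idle" selector and the 5m window are only assumptions for the example.

    ```promql
    # Before: bound to a single container IP through the $instance dashboard variable
    rate(node_cpu_seconds_total{job="node-exporter", instance="$instance", mode="idle"}[5m])

    # After: instance is removed from the selector and aggregated away,
    # so old and new container IPs end up in the same series
    min without (instance) (rate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[5m]))
    ```

    Because "without (instance)" drops only the instance label, the other labels of the metric (here cpu and mode) are kept, so the panel should still show one series per CPU core.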