
Prometheus cAdvisor docker monitoring


I've set up a Docker monitoring stack using Prometheus, Grafana and cAdvisor. I'm using this query to count running containers:

count_scalar(container_last_seen{name=~"container1|container2"})

It picks up the containers all right: as soon as I launch a new container, it is counted right away. The problem is that when a container is stopped or removed, the query does not notice and still counts it as a running container.

On the cAdvisor /metrics endpoint, the container is removed as soon as it stops.

Is there something wrong with the query?

(This is what I used for the stack: https://github.com/vegasbrianc/prometheus)


Solution

  • It seems to be related to how long cAdvisor keeps the data in memory.

    As long as cAdvisor keeps the data in memory, the container_last_seen metric still carries a valid timestamp, so count_scalar still 'sees' the container because the series has a value.

    In my test setup, cAdvisor keeps the data for 5 minutes. After that, your query returns the right count because the container_last_seen metric has disappeared.

    You can change this cAdvisor configuration with the --storage_duration flag.

    --storage_duration=2m0s: How long to store data.
    

    As an alternative, if you want quick alerting, you could also run a query that compares the last-seen timestamp with the current time:

    count_scalar(time()-container_last_seen{name=~"container1|container2"}<=60)
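
To make the difference between the two queries concrete, here is a small Python toy model. It is only a sketch and does not talk to Prometheus or cAdvisor: the container names, timestamps and function names are made up for illustration, and the 60-second threshold is simply the one from the query above.

```python
# Toy model: each entry maps a container name to the Unix timestamp
# carried by its container_last_seen sample (all values are hypothetical).

def count_all(last_seen):
    """Mimics counting every container_last_seen series that is still exported."""
    return len(last_seen)


def count_recent(last_seen, now, max_age=60):
    """Mimics counting only series where time() - container_last_seen <= max_age."""
    return sum(1 for ts in last_seen.values() if now - ts <= max_age)


now = 1_000_000  # hypothetical current time, in seconds
last_seen = {
    "container1": now - 5,    # still running, last seen 5 s ago
    "container2": now - 180,  # stopped 3 min ago, but the series is still exported
}

print(count_all(last_seen))          # 2: the stopped container is still counted
print(count_recent(last_seen, now))  # 1: only the recently seen container counts
```

The first count stays too high for as long as the stale series is exported, while the second one drops within max_age seconds of the container stopping.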