I want to monitor Docker containers running on multiple servers. Let's say I have two servers, A and B, with containers running on them. Now I add one more server (C), and I want to monitor all Docker containers on servers A and B from server C only. I have configured Docker to expose metrics on all servers by following the Docker docs (not using cAdvisor). The target status shows 'ok' for all the servers.

The problem is that the expression is the same for the containers on every server, so Prometheus is not able to differentiate between the servers. Can anyone share a sample Prometheus rule file with an expression for this, i.e. the number of running containers should not be less than x? This is my current rule file:
    groups:
      - name: Server_A
        rules:
          - alert: Central_service_down
            expr: engine_daemon_container_states_containers{state="running"} < 10
            for: 50s
            labels:
              severity: critical
              instance: <IP_of_A>:9323
            annotations:
              summary: "Monitor service non-operational"
              description: "Demo Service {{ $labels.instance }} is down."
      - name: Server_B
        rules:
          - alert: Central_service_down
            expr: engine_daemon_container_states_containers{state="running"} < 10
            for: 50s
            labels:
              severity: critical
              instance: <IP_of_B>:9323
            annotations:
              summary: "Monitor service non-operational"
              description: "Demo Service {{ $labels.instance }} is down."
As you can see, expr: engine_daemon_container_states_containers{state="running"} < 10
is the same for both server A and server B. How can I differentiate the expression for each server? Please share a sample alert file. Thanks in advance.
I have added an instance="ip" matcher to the selector to differentiate the servers, i.e.
expr: engine_daemon_container_states_containers{instance="serverA",state="running"} < 10
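For reference, here is a minimal sketch of how such a rule file might look. It rests on a few assumptions: the instance label on each series is the scrape target address (for example <IP_of_A>:9323, unless it has been relabelled), the alert names and group names are made up for illustration, and the threshold of 5 for server B is a hypothetical value. Prometheus evaluates an alerting expression per time series and keeps the series labels on the resulting alert, so a single rule without an instance matcher should already produce one separate alert per server. Per-server rules with an instance matcher are only needed when the thresholds differ. The two groups are alternatives; loading both would alert twice for the same server.

```yaml
groups:
  # Option 1: one rule for every scraped Docker daemon.
  # Each server's series is checked separately, and the alert
  # carries that server's own instance label.
  - name: docker_containers_all_servers
    rules:
      - alert: RunningContainersLow
        expr: engine_daemon_container_states_containers{state="running"} < 10
        for: 50s
        labels:
          severity: critical
        annotations:
          summary: "Too few running containers on {{ $labels.instance }}"
          description: "{{ $labels.instance }} reports {{ $value }} running containers (expected at least 10)."

  # Option 2: one rule per server, selected with an instance matcher,
  # for cases where each server needs its own threshold.
  # The instance values are placeholders and must match the
  # instance label shown on the Prometheus targets page.
  - name: docker_containers_per_server
    rules:
      - alert: RunningContainersLow_ServerA
        expr: engine_daemon_container_states_containers{instance="<IP_of_A>:9323", state="running"} < 10
        for: 50s
        labels:
          severity: critical
        annotations:
          summary: "Too few running containers on server A"
          description: "{{ $labels.instance }} reports {{ $value }} running containers."
      - alert: RunningContainersLow_ServerB
        # Hypothetical threshold of 5, only to show a per-server difference.
        expr: engine_daemon_container_states_containers{instance="<IP_of_B>:9323", state="running"} < 5
        for: 50s
        labels:
          severity: critical
        annotations:
          summary: "Too few running containers on server B"
          description: "{{ $labels.instance }} reports {{ $value }} running containers."
```

Note that setting instance under labels: in a rule, as in the original file, only overwrites the label on the fired alert. It does not restrict which series the expression is evaluated against; only a matcher inside the selector does that.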