I have a windows_exporter query to monitor certain Windows services:
(windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="running"} == 0) - ignoring(state) (windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="stopped"} == 0)
We have a lot of servers and I don't want to hardcode them, so I added the second part of the query. It is meant to cover the services that are supposed to be stopped. Unfortunately this does not work: when you kill a service with Task Manager, it also goes into the stopped state, so it looks the same as a service that was stopped on purpose. This is the second part:
ignoring(state) (windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="stopped"} == 0)
So now I have to make the second part of the query static (hardcoded), i.e. enter all servers where stopped is OK.
How can I do this? I want to provide a list of servers and Windows services where the stopped state is fine.
Here is the output from the Prometheus console:
windows_service_state{app="xxx", component="MercoServerAdmin", env="Prod", instance="otcs-olap-adm1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerAdmin", env="Prod", instance="otcs-olap-adm1.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Int", instance="otcs-int-f1.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Int", instance="otcs-int-f2.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Prod", instance="otcs-olap-f1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Prod", instance="otcs-olap-f2.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Prod", instance="otcs-olap-f3.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Test", instance="otcs-ooriom-f1.internal-21.example.org", job="yyy-servers-ooriom", name="otsystemcenteragent", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndex", env="Prod", instance="otcs-olap-idx1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndex", env="Prod", instance="otcs-olap-idx1.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdmin", env="Int", instance="otcs-int-idxad1.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdmin", env="Int", instance="otcs-int-idxad1.internal-08.example.org", job="yyy-servers-int", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdminSearch", env="Test", instance="otcs-ooriom-b1.internal-21.example.org", job="yyy-servers-ooriom", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdminSearch", env="Test", instance="otcs-ooriom-b1.internal-21.example.org", job="yyy-servers-ooriom", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdminSearch", env="Test", instance="otcs-ooriom-b1.internal-21.example.org", job="yyy-servers-ooriom", name="otsystemcenteragent", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Int", instance="otcs-int-src.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Int", instance="otcs-int-src.internal-08.example.org", job="yyy-servers-int", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src1.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src2.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src2.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Int", instance="otds-int-f1.internal-08.example.org", job="yyy-servers-int", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Prod", instance="otds-olap-f1.internal-08.example.org", job="yyy-servers-prod", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Prod", instance="otds-olap-f2.internal-08.example.org", job="yyy-servers-prod", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Test", instance="otds-ooriom-f1.internal-21.example.org", job="yyy-servers-ooriom", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="SystemCenter", env="Test", instance="otsc-ooriom-f1.internal-21.example.org", job="yyy-servers-ooriom", name="otsystemcentermanager", state="stopped"}
0
You can get a list of services that are in neither the running nor the stopped state with the following query:
windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state!~"running|stopped"} == 1
As an added bonus, the state label will show the actual state of the service.
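For the static list asked about in the question, one option is to keep alerting on services that are not running and remove the known exceptions with unless on(instance, name). Since there are many servers, the sketch below generates that expression in Python from a list of (instance, service) pairs. Treat it as a sketch under assumptions, not a definitive solution: build_alert_expr is a hypothetical helper, the two example pairs are picked arbitrarily from the output above (they are not servers known to be allowed to stop), and it assumes windows_exporter exposes a state="stopped" series for every monitored service, as the output above suggests.

```python
from typing import Sequence, Tuple


def build_alert_expr(base_selector: str,
                     allowed_stopped: Sequence[Tuple[str, str]]) -> str:
    """Build a PromQL expression for services that are not running,
    minus the (instance, service) pairs where being stopped is acceptable.

    base_selector is the label matcher list without braces,
    e.g. 'app="xxx", name=~"tomcat.+|ot.+"'.
    """
    # Left-hand side: every monitored service whose "running" series is 0.
    not_running = (
        f'(windows_service_state{{{base_selector}, state="running"}} == 0)'
    )
    if not allowed_stopped:
        return not_running

    # Right-hand side: one exact-match selector per allowed pair.
    # Only the existence of the series matters for "unless", not its value.
    exceptions = "\n  or\n  ".join(
        f'windows_service_state{{instance="{inst}", name="{svc}", state="stopped"}}'
        for inst, svc in allowed_stopped
    )
    # "unless on(instance, name)" drops left-hand series that have a
    # right-hand series with the same instance and name labels.
    return f"{not_running}\nunless on(instance, name)\n(\n  {exceptions}\n)"


# Hypothetical exception list, for illustration only.
ALLOWED_STOPPED = [
    ("otcs-int-f1.internal-08.example.org", "otcs"),
    ("otds-int-f1.internal-08.example.org", "tomcat10"),
]

if __name__ == "__main__":
    print(build_alert_expr('app="xxx", name=~"tomcat.+|ot.+"', ALLOWED_STOPPED))
```

The printed expression can be pasted into the Prometheus console or used as an alert rule expression. Exact label matching (=) is used for the exceptions, so the dots in the host names need no regex escaping.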