
Windows_exporter: static list in left part "ignoring(state) (windows_service_state{...state="stopped"} == 0)" instead of dynamic output


I have a windows_exporter query to monitor certain Windows services:

(windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="running"} == 0) - ignoring(state) (windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="stopped"} == 0)

We have a lot of servers and I don't want to hardcode them, so I added the second part of the query, which is meant to cover the services that are supposed to be stopped. Unfortunately this does not work: when a service is killed with Task Manager, it also goes into the stopped state.

ignoring(state) (windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="stopped"} == 0)

So now I have to make the second part of the query static (hardcoded), i.e. enter all servers where the stopped state is OK.

How can I do this, i.e. provide a list of servers and Windows services for which the stopped state is fine?

Here is the output from the Prometheus console:

windows_service_state{app="xxx", component="MercoServerAdmin", env="Prod", instance="otcs-olap-adm1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerAdmin", env="Prod", instance="otcs-olap-adm1.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Int", instance="otcs-int-f1.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Int", instance="otcs-int-f2.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Prod", instance="otcs-olap-f1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Prod", instance="otcs-olap-f2.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Prod", instance="otcs-olap-f3.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerFrontend", env="Test", instance="otcs-ooriom-f1.internal-21.example.org", job="yyy-servers-ooriom", name="otsystemcenteragent", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndex", env="Prod", instance="otcs-olap-idx1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndex", env="Prod", instance="otcs-olap-idx1.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdmin", env="Int", instance="otcs-int-idxad1.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdmin", env="Int", instance="otcs-int-idxad1.internal-08.example.org", job="yyy-servers-int", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdminSearch", env="Test", instance="otcs-ooriom-b1.internal-21.example.org", job="yyy-servers-ooriom", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdminSearch", env="Test", instance="otcs-ooriom-b1.internal-21.example.org", job="yyy-servers-ooriom", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerIndexAdminSearch", env="Test", instance="otcs-ooriom-b1.internal-21.example.org", job="yyy-servers-ooriom", name="otsystemcenteragent", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Int", instance="otcs-int-src.internal-08.example.org", job="yyy-servers-int", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Int", instance="otcs-int-src.internal-08.example.org", job="yyy-servers-int", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src1.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src1.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src2.internal-08.example.org", job="yyy-servers-prod", name="otcs", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="MercoServerSearch", env="Prod", instance="otcs-olap-src2.internal-08.example.org", job="yyy-servers-prod", name="otcsadmin", state="stopped", webserver="IIS"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Int", instance="otds-int-f1.internal-08.example.org", job="yyy-servers-int", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Prod", instance="otds-olap-f1.internal-08.example.org", job="yyy-servers-prod", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Prod", instance="otds-olap-f2.internal-08.example.org", job="yyy-servers-prod", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="DirectoryServices", env="Test", instance="otds-ooriom-f1.internal-21.example.org", job="yyy-servers-ooriom", name="tomcat10", state="stopped"}
0
windows_service_state{app="xxx", component="SystemCenter", env="Test", instance="otsc-ooriom-f1.internal-21.example.org", job="yyy-servers-ooriom", name="otsystemcentermanager", state="stopped"}
0



Solution

  • You can list the services that are in neither the running nor the stopped state with the following query:

    windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state!~"running|stopped"} == 1
    

    As an added bonus, the state label will show the actual state of the service.
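
  • For the static exceptions asked about in the question, one possible sketch is to subtract a hand-maintained list with unless on(instance, name). This is an assumption about what is wanted and has not been tested against this setup. The instance and name pairs below are arbitrary picks from the console output in the question and only illustrate the shape; replace them with the servers and services where the stopped state is acceptable:

    # Services that are not running ...
    windows_service_state{app="xxx", name=~"tomcat.+|ot.+", state="running"} == 0
    # ... except for the (instance, name) pairs listed by hand below.
    unless on(instance, name)
    (
        # Hypothetical exception: both services on one admin server.
        windows_service_state{instance="otcs-olap-adm1.internal-08.example.org", name=~"otcs|otcsadmin", state="running"}
      or
        # Hypothetical exception: a single service on another server.
        windows_service_state{instance="otsc-ooriom-f1.internal-21.example.org", name="otsystemcentermanager", state="running"}
    )

    The selectors on the right-hand side carry no value comparison, so they match whether the service is up or down. unless then drops every left-hand series that shares its instance and name labels with one of them, leaving only the services that are not running and not on the exception list.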