I have a PromQL query in Grafana that returns the CPU usage of all Namespaces in my Kubernetes cluster, aggregated by a Namespace label called risk:
sort_desc(
  sum(
    max(kube_namespace_labels{label_risk="$risk_query"}) by (label_risk, namespace)
    * on(namespace) group_right(label_risk)
    sum by (namespace) (avg_over_time(namespace:container_cpu_usage:sum{}[$__range]))
  ) by (label_risk)
)
As you can see, the query is filtered by the Grafana query variable risk_query, which is defined like this:
label_values(kube_namespace_labels, label_risk)
To an extent, this query works as intended: the dashboard's user can select any existing value of the risk Namespace label in the drop-down menu and see the CPU usage of all Namespaces that have the risk label set to that value.
The problem is that the user also needs a working "All" option for this variable, so that the query returns the CPU usage of all Namespaces, even the ones that don't have the risk label defined. Unfortunately, enabling "Include All option" for the risk_query variable returns "No data", whether we set the "Custom all value" to <blank> or "". The only way I have found to include the CPU usage of all Namespaces is to remove {label_risk="$risk_query"} from the query entirely, but that makes filtering impossible.
In summary, how can I keep the feature of filtering this query but also allow the user to see CPU usage of all Namespaces when desired?
Thanks to @valyala for pointing me in the right direction to solve this problem.
First, when defining the risk_query query variable in Grafana, use .* as the "Custom all value".
Second, use the regex matcher {label_risk=~"$risk_query"} instead of the equality matcher {label_risk="$risk_query"}:
sort_desc(
  sum(
    max(kube_namespace_labels{label_risk=~"$risk_query"}) by (label_risk, namespace)
    * on(namespace) group_right(label_risk)
    sum by (namespace) (avg_over_time(namespace:container_cpu_usage:sum{}[$__range]))
  ) by (label_risk)
)
This expression works even for Namespaces that do not have label_risk defined, because the regex .* also matches the empty string. As the Prometheus documentation explains:
Label matchers that match empty label values also select all time series that do not have the specific label set at all.
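To make that rule concrete, here is a rough sketch of what the selector looks like once Grafana has interpolated the variable. The value high is a hypothetical risk label value used only for illustration, and the third selector is shown purely as a contrast; it is not part of the dashboard query.

```promql
# "All" selected: $risk_query expands to the custom all value .*
# .* matches the empty string, so Namespaces without label_risk are included.
kube_namespace_labels{label_risk=~".*"}

# A single value selected (hypothetical value "high").
# Regex matchers are fully anchored, so this behaves like label_risk="high".
kube_namespace_labels{label_risk=~"high"}

# For contrast: .+ requires at least one character,
# so Namespaces without label_risk are excluded.
kube_namespace_labels{label_risk=~".+"}
```

In other words, the choice between .* and .+ as the "Custom all value" decides whether "All" means every Namespace or only the Namespaces that carry a risk label.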