Search code examples
kubernetesprometheusgrafanapromql

Counting sum of only distinct values while displaying other columns


I'm trying to create an alert using promql/prometheus but having trouble generating the proper time series. In K8s, my objective is to display any code/version mismatch for a particular app sitting in multiple clusters. By using count, any app with a value greater than 1 would tell me that there are > 1 version deployed. Currently my query will generate the following:

count(kube_deployment_labels{label_app=~".*"}) by (label_version, label_app)

| label_app |  label_version  |  Value #A |
-------------------------------------------
| app_1     |   0.0.111       |     2     |
| app_1     |   0.0.222       |     1     |
| app_2     |   0.0.111       |     2     |
| app_2     |   0.0.222       |     1     |
| app_3     |   0.0.111       |     3     |

The values on the 4th column represent the # of said code (label_version) deployed in each cluster; for example, app_1 & app_2 has version 0.0.111 deployed in two clusters & 0.0.222 deployed in one cluster, but app_3 has the same version deployed in all three clusters.

My end goal is to only count distinct label_version and have the time series populate in this way:

| label_app |  Value #A  |
 -------------------------
| app_1     |      2     |
| app_2     |      2     |
| app_3     |      1     |

Executing (count(group by(label_version)kube_deployment_labels{label_app=~".*"})) gives me the correct Value but I'd like to list out all the apps associated to the Value as well. I've tried a variety of grouping but was unsuccessful.

Any help would be appreciated! Thanks in advance.


Solution

  • You can apply one more count by over your previous result.

    count(
     count(
      kube_deployment_labels{label_app=~".*"}
     ) by (label_version, label_app)
    ) by (label_app)
    

    will return number of label_versions associated with each label_app.

    Somewhat similar query can be seen in this demo.