PromQL: percent successes by category


I have a gauge with a label "success_or_failure" that takes the values "success" and "failure". I want to create a chart in Grafana that shows the percentage of successes broken down by another label, "category". The idea is to see the "percent success" for each category over time.

The PromQL I want would be something like:

avg by (category) 
  (event_gauge{success_or_failure="success"} / 
  ignoring(success_or_failue) group_right event_gauge)

When I do this, every charted element is equal to 1. This leads me to think that I'm doing it wrong. What am I doing wrong here?


Solution

  • A plain selector such as event_gauge only returns the most recent stored sample of each series at each evaluation time. The PromQL query you listed therefore always works from the last value of your metric, not from its history.

    You could do something like

    sum_over_time(event_gauge{success_or_failure="success"}[1h]) / 
    count_over_time(event_gauge{success_or_failure="success"}[1h])
    

    to sort of get what you want.
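
    As a rough sketch of the same idea broken down per category: this assumes the gauge value is the number of events in that state (the question does not say what the value holds), and the 1h window is an arbitrary choice.

    ```promql
    # Assumption: the value of event_gauge is an event count per state.
    # Numerator: "success" samples summed over the window, per category.
    # Denominator: all samples (success and failure) over the same window.
    100 *
      sum by (category) (sum_over_time(event_gauge{success_or_failure="success"}[1h]))
    /
      sum by (category) (sum_over_time(event_gauge[1h]))
    ```

    Both sides are aggregated down to the category label alone, so the division matches one-to-one and needs no ignoring() or group_right modifier.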

    But really, the data is not modelled well for use with Prometheus. You should think about using counters for this.

    Something like event_counter{category="xyz", status="success"} and event_counter{category="xyz", status="failure"}, with your instrumentation code incrementing the matching counter each time an event occurs.

    Let Prometheus just scrape the latest state of your counters.

    Then you can do

    avg by (category) 
     (sum without (status) (event_counter{category="xyz", status="success"}) 
        / 
      sum without (status) (event_counter{category="xyz"}))
    

    P.S. I haven't fully tested it, so there might be slight syntax mistakes. Most importantly, see if you can model this as a counter instead of a gauge.
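
    With counters, a ratio of raw values describes everything since the process started. To chart the success percentage over time, the ratio is more commonly taken over rate(). A minimal sketch, assuming the hypothetical event_counter metric and status label suggested above; the 5m window is an arbitrary choice:

    ```promql
    # Per-category success percentage over the last 5 minutes.
    # event_counter and its status label are the illustrative names from this answer.
    100 *
      sum by (category) (rate(event_counter{status="success"}[5m]))
    /
      sum by (category) (rate(event_counter[5m]))
    ```

    This returns one series per category. A category with no events in the window has a zero denominator and shows up as NaN instead of as a percentage.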