
How to count samples above a threshold in a range vector in PromQL (Prometheus)


I defined a latency metric. Querying it returns a single value per series (an instant vector), like below:

latency{name="Controller/products/show",percentiles="95"}   0.9935112

Then I ran the following query, which returns a range vector:

latency{name="Controller/products/show",percentiles="95"}[10m]

output:

element:
latency{name="Controller/products/show",percentiles="95"}

value:
0.9429009 @1584497778.164
0.9150374 @1584497838.164
0.9085548 @1584497898.164
0.9006939 @1584497958.164
0.9390876 @1584498018.164
0.9593425 @1584498138.164
0.96289706 @1584498198.164
0.98113775 @1584498258.164
0.9935112 @1584498318.164

I want to count the values above 0.95 in the range vector.

For example, the result should be 4 for the range vector above.

Does anyone have a solution?


Solution

  • A Prometheus subquery can be used for this task:

    count_over_time((latency{name="Controller/products/show",percentiles="95"} > 0.95)[10m:50s])
    

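    Note that the step value after the colon (50s in the example above) should not exceed the scrape interval for the selected metric (roughly 60s in the question's data). Prometheus evaluates the query inside the parentheses at regular points with the configured step between them, so a larger step would skip samples. The result counts evaluation points rather than raw samples: each point uses the most recent sample at that time, so with a step smaller than the scrape interval the same raw sample can be counted more than once.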

    Update: This task can also be solved without subqueries by using the count_gt_over_time() function from MetricsQL, the query language of VictoriaMetrics. For example, the following query returns the number of raw samples with values exceeding 0.95 over the last 10 minutes:

    count_gt_over_time(latency{name="Controller/products/show",percentiles="95"}[10m], 0.95)
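
To make the difference between the two approaches concrete, here is a minimal Python sketch that models them on the samples from the question. It is not Prometheus code: the function names are hypothetical, and the subquery model rests on simplifying assumptions (evaluation points aligned to multiples of the step, a left-open window, the default 5 minute lookback, staleness markers ignored).

```python
from bisect import bisect_right

# Samples from the question as (timestamp in milliseconds, value).
SAMPLES = [
    (1584497778164, 0.9429009),
    (1584497838164, 0.9150374),
    (1584497898164, 0.9085548),
    (1584497958164, 0.9006939),
    (1584498018164, 0.9390876),
    (1584498138164, 0.9593425),
    (1584498198164, 0.96289706),
    (1584498258164, 0.98113775),
    (1584498318164, 0.9935112),
]


def count_raw_over(samples, threshold):
    """Count raw samples strictly above the threshold.

    This is what count_gt_over_time() is described as returning.
    """
    return sum(1 for _, value in samples if value > threshold)


def simulate_subquery_count(samples, threshold, end_ms, window_ms, step_ms,
                            lookback_ms=300_000):
    """Rough model of count_over_time((metric > threshold)[window:step]).

    Assumptions (not taken from the original answer): evaluation points are
    multiples of step_ms inside (end_ms - window_ms, end_ms], and each point
    sees the most recent sample that is younger than lookback_ms.
    """
    timestamps = [ts for ts, _ in samples]
    start_ms = end_ms - window_ms
    # First multiple of the step strictly after the window start.
    t = (start_ms // step_ms + 1) * step_ms
    count = 0
    while t <= end_ms:
        # Index of the latest sample with timestamp <= t, or -1 if none.
        i = bisect_right(timestamps, t) - 1
        if i >= 0 and t - timestamps[i] < lookback_ms:
            if samples[i][1] > threshold:
                count += 1
        t += step_ms
    return count


if __name__ == "__main__":
    end_ms = SAMPLES[-1][0]  # evaluate at the time of the last sample
    print("raw samples above 0.95:", count_raw_over(SAMPLES, 0.95))
    for step_s in (10, 50, 60):
        n = simulate_subquery_count(SAMPLES, 0.95, end_ms, 600_000, step_s * 1000)
        print(f"subquery model, step {step_s}s:", n)
```

Under these assumptions the raw count is 4 for the question's data, while the modeled subquery result depends on the step and on where the evaluation points fall relative to the samples. A small step such as 10s counts each matching sample several times, which is why the subquery result is best read as a number of evaluation points rather than a number of raw samples.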