prometheus grafana

How can I create a deviations query?


I want to write a query that detects deviations in the number of created orders.

Task: the query should look back 7 days and, based on that data, build a minimum allowable threshold (MAT). If the number of orders in the shortest period considered (5 minutes) is less than the MAT, an alert should be generated.

Features: the number of orders depends directly on the time of day and on seasonality.

Searching the Internet, I found information about the Poisson distribution and tried to apply it to the problem, but it didn't work.

Prometheus has functions such as day_of_week(), avg_over_time() and stddev_over_time().

What I have managed so far:

  1. The change in the number of orders over the last 5 minutes: sum(delta(my_search_counter{service_name="car.book.v1"}[5m]))
  2. The average of that five-minute value over the last week, at a resolution of 5 minutes: avg_over_time(sum(delta(my_search_counter{service_name="car.book.v1"}[5m]))[1w:5m])
  3. The standard deviation over the same window: stddev_over_time(sum(delta(my_search_counter{service_name="car.book.v1"}[5m]))[1w:5m])

This is where I'm stuck: I can't figure out how to build a proper query. Maybe there is another, simpler way, but I haven't found it.

I tried to combine these queries with each other using addition, subtraction and division.


Solution

  • I'm not sure what kind of statistic this is, or how adequate it is as a threshold, but here is the query you described.

    sum(increase(my_search_counter{service_name="car.book.v1"}[5m]))
    < sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))
      - stddev_over_time(sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))[1d:5m])
    

    It returns a value if the number of orders over the last 5 minutes is less than the number of orders over the same 5 minutes one week ago, minus the standard deviation of the order count over the 24 hours preceding that moment one week ago.

    You might need to experiment with a multiplier on the stddev term to get a reasonable rate of alerts.
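
    As a rough sketch of that tuning, the stddev term can be scaled by a constant. The factor 2 below is only an assumed starting point, not a recommended value: a larger factor lowers the threshold and should produce fewer alerts, a smaller one raises it.

    sum(increase(my_search_counter{service_name="car.book.v1"}[5m]))
    < sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))
      - 2 * stddev_over_time(sum(increase(my_search_counter{service_name="car.book.v1"}[5m] offset 1w))[1d:5m])

    In PromQL, * binds tighter than -, and - binds tighter than <, so the expression is read as "current < (week-ago value - 2 * stddev)" without extra parentheses.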