Riemann - trigger resolve based on metric threshold


I am trying to set up an alert in Riemann (through PagerDuty) based on a threshold for a metric. If the threshold is breached, the alert should be triggered; if the metric goes back within the threshold, the alert should be resolved.

My steps are:

1) Create an event with state "warning" if the threshold is breached.
2) Create an event with state "ok" if the threshold is not breached.

My code for this looks like -

(let [index (default :ttl 120 (index))]
   (streams
      index
      (where (service #"test")
         (where (>= metric 100)
            (smap (fn [e]
                    (event {:service (:service e) :metric (:metric e) 
                            :state "warning" }) 
               index))))

(I have only shown the relevant bits of code)

I see that this code does not create a new event when the threshold is breached.

I am not sure if I am making a mistake. Any help would be appreciated.

Regards,

Sathya


Solution

  • It sounds like you have two questions:

    • Why is the event not getting into the index when the metric is at or above 100?
    • Where should I put the calls to create and resolve the PD alerts?

    As to the first one, your code looks correct; it should be indexing the event. You may want to set a :ttl on the new event so that it expires at the correct time, and a :host key as well for good measure. In general, the with function accomplishes the same thing more easily.
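    As a minimal sketch of the with approach, assuming the rest of the config stays as in the question: with copies the incoming event and overrides only the listed fields, so :host, :service and :metric carry over without being rebuilt by hand. The :ttl of 120 simply mirrors the index default and is an assumption.

    ```clojure
    (let [index (default :ttl 120 (index))]
      (streams
        index
        (where (service #"test")
          (where (>= metric 100)
            ;; Override :state and :ttl; every other field of the
            ;; original event is kept as-is.
            (with {:state "warning" :ttl 120}
              index)))))
    ```

    Here index is passed as a child stream of with. In the smap form, the equivalent is to place index after the parenthesis that closes (fn [e] ...). As posted, index appears before that parenthesis, which would make it the function's return value instead of a child of smap; this may only be a slip in transcription.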

    For the second question a rough outline looks something like this:

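    (let [index (default :ttl 120 (index))]
       (streams
          index
          (where (service #"test")
             ;; threshold breached: mark the event "warning" and trigger the alert
             (where (>= metric 100)
               (with :state "warning"
                 (rollup 2 3600 (create-pd-alert-here))))
             ;; metric has recovered: mark the event "ok" and resolve the alert
             (where (< metric 50)
               (with :state "ok"
                 (resolve-pd-alert-here))))))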
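
    One way the two placeholders above might be filled in is with Riemann's pagerduty stream together with changed-state, so that PagerDuty is only called when the state actually flips. This is a sketch under assumptions, not a tested config: "my-service-key" is a placeholder, the name notify is made up for illustration, and the options-map form of pagerduty is assumed (older Riemann releases take the key as a plain string).

    ```clojure
    ;; Sketch only: "my-service-key" is a placeholder and `notify` is a
    ;; hypothetical name. The options-map form of `pagerduty` is assumed.
    (let [index  (default :ttl 120 (index))
          pd     (pagerduty {:service-key "my-service-key"})
          ;; Pass events on only when :state changes for a given host and
          ;; service, so a run of "warning" events triggers once.
          notify (changed-state {:init "ok"}
                   (where (state "warning") (:trigger pd))
                   (where (state "ok")      (:resolve pd)))]
      (streams
        index
        (where (service #"test")
          ;; Events with a metric from 50 up to (but not including) 100
          ;; match neither branch and leave the alert state untouched.
          (split
            (>= metric 100) (with :state "warning" notify)
            (< metric 50)   (with :state "ok" notify)))))
    ```

    Both branches feed the same notify stream, which is what lets changed-state see the transition from "warning" back to "ok". The gap between the two thresholds keeps a metric hovering around 100 from repeatedly opening and resolving the incident.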