zabbix

Stop repeated alerts for the same trigger in Zabbix


I'm using Zabbix 3.2 and have configured a mail alert action for all triggers. Say trigger A sends an alert (problem event) through the escalation and returns to normal (OK event alert) after a few minutes. If the same trigger A fires again within the next few minutes, I need to suppress that alert. How can I do this? I've tried following this documentation: https://www.zabbix.com/documentation/3.2/manual/config/notifications/action/escalations


Solution

  • The question seems to be about preventing trigger flapping. In general, three methods are suggested:

    • use trigger functions - for example, instead of last() use avg(15m) - then the alerting will happen only after the average value for 15 minutes has exceeded the threshold. Other useful trigger functions might be min() and max()
    • use hysteresis - this makes the trigger fire at one threshold but resolve at another. Before Zabbix 3.2 that was done in the trigger expression itself; since Zabbix 3.2 there is a separate "recovery" expression field
    • use action escalations that do nothing at first, and only send an alert when the problem has been there for some period of time - for example, sending out the alert on the second or third step

    All three methods achieve a similar outcome, but the key differences are:

    • the first method - trigger functions - makes the trigger fire later, but reduces the number of events (the number of times the trigger fires)
    • the second method - hysteresis - makes the trigger fire at the same time as the "flappy" trigger would, but delays the recovery event. It also reduces the number of events
    • the third method - delayed escalation steps - does not affect the trigger at all, it can keep on flapping. It will only alert if the problem is there for a longer time, though.
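The effect of the first two methods on event counts can be illustrated with a small simulation. This is only a sketch in Python, not Zabbix code: the sample values, the threshold of 5, the 4-sample averaging window (standing in for a time window such as avg(15m)) and the recovery threshold of 3 are all made-up assumptions, and the function names are hypothetical.

```python
from typing import List


def count_problem_events(states: List[bool]) -> int:
    """Count OK -> PROBLEM transitions, assuming the trigger starts in OK."""
    events = 0
    previous = False
    for state in states:
        if state and not previous:
            events += 1
        previous = state
    return events


def last_trigger(values: List[float], threshold: float) -> List[bool]:
    """PROBLEM whenever the newest value exceeds the threshold (like last())."""
    return [v > threshold for v in values]


def avg_trigger(values: List[float], threshold: float, window: int) -> List[bool]:
    """PROBLEM when the mean of the newest `window` samples exceeds the threshold."""
    states = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1): i + 1]
        states.append(sum(recent) / len(recent) > threshold)
    return states


def hysteresis_trigger(values: List[float], problem_above: float,
                       recover_below: float) -> List[bool]:
    """Enter PROBLEM above one threshold, return to OK only below a lower one."""
    states = []
    in_problem = False
    for v in values:
        if not in_problem and v > problem_above:
            in_problem = True
        elif in_problem and v < recover_below:
            in_problem = False
        states.append(in_problem)
    return states


# Hypothetical item values: a flapping phase (6, 3, 6, 3, ...) and then a
# sustained problem (9, 9, 9, 9).
samples = [2, 6, 3, 6, 3, 6, 3, 2, 2, 9, 9, 9, 9, 2, 2, 2]

plain = last_trigger(samples, threshold=5)
averaged = avg_trigger(samples, threshold=5, window=4)
hysteresis = hysteresis_trigger(samples, problem_above=5, recover_below=3)

for name, states in [("last", plain), ("avg", averaged), ("hysteresis", hysteresis)]:
    print(name, "events:", count_problem_events(states),
          "first problem at sample:", states.index(True))
```

With these made-up samples the plain trigger produces four problem events, the averaged one produces a single event that starts later, and the hysteresis one fires at the same sample as the plain trigger but produces only two events.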

    Hysteresis will usually alert when a trigger would have flapped; delayed escalation steps will ignore short-lived problems.
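The behaviour of delayed escalation steps can be sketched in the same way. The model below is a simplification and an assumption, not Zabbix internals: it treats step N as starting (N - 1) step durations after the problem event and assumes the escalation stops once the trigger recovers. The function name, the 60-second step duration and the problem durations are hypothetical.

```python
from typing import List


def alerts_sent(problem_durations: List[int], step_duration: int,
                first_alert_step: int) -> List[bool]:
    """Return, per problem, whether it lasted long enough to reach the alert step.

    Simplified model: step N begins (N - 1) * step_duration seconds after the
    problem event, and a problem that has already recovered by then sends nothing.
    """
    delay = (first_alert_step - 1) * step_duration
    return [duration > delay for duration in problem_durations]


# Hypothetical problem durations in seconds: two short blips and one real outage.
durations = [30, 90, 600]

# Alerting on the third step with 60-second steps means waiting 120 seconds.
delayed = alerts_sent(durations, step_duration=60, first_alert_step=3)
# Alerting on the first step sends a message for every problem event.
immediate = alerts_sent(durations, step_duration=60, first_alert_step=1)

print("delayed:", delayed)
print("immediate:", immediate)
```

In this sketch only the 600-second problem reaches the third step, so the two short-lived problems never produce a message even though the trigger itself still changed state each time.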

    Complexity-wise, I'd usually go with the first method - it is the easiest to configure, the hardest to misconfigure and the easiest to understand. I'd go with one of the two other methods if I specifically needed the way they make events/alerts behave - those methods have a bit higher potential to be misconfigured or misunderstood.

    Note that the item key reference in the comment is wrong - host is separated from key with colon, full key name is missing and the parameter is wrong. See the agent key page in the manual for correct key syntax.