
How to teach Zabbix to be smart about short spikes in events?


Recently I've started receiving alerts from Zabbix about high iowait on one of our servers. The alerts are caused by pg_dump backing up our database, which is perfectly fine: the spike is short and, well, backing up is a legitimate activity on the server.

Is there a way to teach Zabbix to be smart about such things? I see no need for email alerts about one short (<30 seconds) iowait spike per day. On the other hand, if the server's day is full of 5-second spikes, that should be investigated.


Solution

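  • The most common and easiest approach is to choose an appropriate trigger function. Instead of last(), which you are most likely using now, try min(60) or avg(60). With min(60) the trigger fires only if every value collected in the last 60 seconds exceeds the threshold, so a brief spike is ignored, but so is repeated short spiking. avg(60) compares the mean over the same period, so it might catch repeated spikes if they are frequent or high enough to raise the average.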

    Other potentially useful trigger functions for this purpose:

    • regexp() (mostly for text items)
    • str() (mostly for text items)
    • count()

    Note that last() always evaluates a single value, the most recent one. A time period passed as its parameter is ignored, so last(), last(0) and last(300) are all equivalent.
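To see why the choice of function matters, here is a rough Python sketch that mimics how last(), min(), avg() and count() would judge the same window of samples. It is not Zabbix code. The threshold of 20, the sample values and the helper name evaluate are all made up for illustration, and it assumes one sample every 10 seconds, so six samples stand in for a 60-second window.

```python
from statistics import mean

THRESHOLD = 20.0  # hypothetical iowait percentage, not a Zabbix default


def evaluate(samples, threshold=THRESHOLD):
    """Mimic how each trigger function would judge one window of samples."""
    return {
        # last(): only the most recent value is compared
        "last": samples[-1] > threshold,
        # min(): true only if every value in the window is above the threshold
        "min": min(samples) > threshold,
        # avg(): true if the mean of the window is above the threshold
        "avg": mean(samples) > threshold,
        # count(): number of values in the window above the threshold
        "count": sum(1 for v in samples if v > threshold),
    }


# Assumed sampling: one value every 10 seconds, six values per window.
single_spike = [2, 3, 2, 2, 3, 45]        # one short spike, e.g. a backup
repeated = [45, 2, 48, 3, 50, 2]          # spikes that keep coming back
sustained = [40, 42, 38, 45, 41, 44]      # iowait stays high throughout

if __name__ == "__main__":
    for name, samples in [("single spike", single_spike),
                          ("repeated spikes", repeated),
                          ("sustained load", sustained)]:
        print(name, evaluate(samples))
```

In this made-up data, last() reacts to the single spike, min() reacts only to the sustained load, avg() picks up the repeated spikes, and count() tells the single spike (1 value above the threshold) apart from the repeated ones (3 values), which is the kind of distinction the question asks for.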