Zabbix ring topology trigger dependencies


I have a question about redundant network topologies. I want to track the ICMP availability of each switch in my network. Every switch uses a template with an icmpping item and a trigger on the last value of that item (the ICMP trigger; this is actually Template ICMP Ping). The Zabbix version is 4.2.8.

Let's say I have a linear topology where the devices are connected in a straight line: Switch C <-> Switch B <-> Switch A <-> Aggregation. In such a topology the trigger dependencies are fairly obvious: Switch C's availability depends on the state of Switch B, Switch B's on the state of Switch A, and Switch A's on the availability of the Aggregation device. Setting up trigger dependencies for this is not a problem.

But now I have a ring topology: Switch C is connected to both Switch A and Switch B, and Switch A and Switch B are both connected to the Aggregation device, forming the ring C <-> A <-> Aggr <-> B <-> C. In this case I could make the Switch C ICMP trigger depend on the availability triggers of both Switch A and Switch B. But if one of the uplink switches (A or B) fails, I still would not know whether C is up or down: the Switch C trigger would be suppressed because at least one parent trigger has fired:

Before changing the status of the 'Host is down' trigger, Zabbix will check for corresponding trigger dependencies. If found, and one of those triggers is in 'Problem' state, then the trigger status will not be changed and thus actions will not be executed and notifications will not be sent.

I can think of several ways to handle this.

Option 1: As a workaround, I can manually change the ICMP trigger so that a single trigger tracks the icmpping value of this device (Switch C) and of both uplinks, like this:

{Switch_C:icmpping[{HOST.IP}].last()}=0 and ({Switch_A:icmpping[{HOST.IP}].last()}<>0 or {Switch_B:icmpping[{HOST.IP}].last()}<>0)

But since I use the same templates for all devices (those in a linear topology and those in a ring), this would mean adding a non-template trigger to every 'ring' device, which is quite a lot of work.
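
To make the difference concrete, here is a minimal Python sketch (not Zabbix code) that models both behaviours as boolean functions. It assumes icmpping returns 1 for reachable and 0 for unreachable, and that a dependent trigger is suppressed as soon as any one of its parent triggers is in the Problem state, as the quoted documentation describes. The function names are made up for illustration.

```python
from itertools import product


def dependency_alerts(c_up: bool, a_up: bool, b_up: bool) -> bool:
    """Plain ICMP trigger on C with dependencies on the A and B triggers.

    Modelled assumption: the C trigger is suppressed whenever at least
    one parent trigger (A or B) is in the Problem state.
    """
    c_problem = not c_up
    any_parent_problem = (not a_up) or (not b_up)
    return c_problem and not any_parent_problem


def combined_alerts(c_up: bool, a_up: bool, b_up: bool) -> bool:
    """Option 1 expression: C = 0 and (A <> 0 or B <> 0)."""
    return (not c_up) and (a_up or b_up)


def truth_table():
    """Return (c_up, a_up, b_up, dependency_result, combined_result) rows."""
    rows = []
    for c_up, a_up, b_up in product([True, False], repeat=3):
        rows.append(
            (
                c_up,
                a_up,
                b_up,
                dependency_alerts(c_up, a_up, b_up),
                combined_alerts(c_up, a_up, b_up),
            )
        )
    return rows


if __name__ == "__main__":
    print("C_up  A_up  B_up  dependencies  option_1")
    for row in truth_table():
        print("  ".join(str(value).ljust(5) for value in row))
```

The two functions disagree only when C is down and exactly one uplink is down: the dependency model stays silent, while the Option 1 expression still fires because C remains reachable through the other uplink.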

Option 2: I can monitor the status of the interfaces on A and B that connect to C, but that is even more work than the previous option because I would need to add an interface state item to every switch.

Is there a better way to monitor devices correctly in such ring topologies?


Solution

  • The short answer is "you can't", but I'll try to expand on it.

    The first thing to consider: you are checking the reachability of your switches' IP addresses (probably assigned to specific VLAN interfaces) from the Zabbix server's point of view. Is the server connected to switch A, B or C?

    Then, modern spanning tree algorithms handle VLANs with multiple instances, so the VLAN dedicated to your switches' IPs can have a blocked port between B and C while your production VLAN could be blocking between A and B. You have multiple logical topologies within a single physical topology.

    There is no network topology auto-mapper; it has been requested from time to time, but nobody has created one, so you have to implement it on your own.

    You can for instance query the switches with a custom script to:

    • get the LLDP or CDP neighbour status from the switches to build the current topology
    • dynamically create screens and maps with API calls
    • dynamically set up and delete dependencies with API calls
    • react to changes in the topology
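
As a rough illustration of the dependency step, here is a minimal Python sketch. It assumes you have already collected the currently active links (for example from LLDP neighbours, minus ports blocked by spanning tree) and that you know the ICMP trigger ID of each switch. The host names, trigger IDs, and helper names are hypothetical. The request body follows the JSON-RPC shape of the `trigger.adddependencies` method as documented for the Zabbix 4.x API (`triggerid`, `dependsOnTriggerid`, token in `auth`); check it against your version before relying on it. The sketch only builds the request and does not send anything.

```python
import json
from collections import deque


def upstream_parents(links, root):
    """Breadth-first search from `root` over undirected links.

    Returns {switch: upstream neighbour}, where the upstream neighbour is
    the next hop towards `root` (the point where the Zabbix server attaches).
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    parents = {}
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                parents[neighbour] = node
                queue.append(neighbour)
    return parents


def dependency_request(parents, trigger_ids, auth_token, request_id=1):
    """Build a JSON-RPC body that makes each trigger depend on its parent's."""
    params = [
        {"triggerid": trigger_ids[child], "dependsOnTriggerid": trigger_ids[parent]}
        for child, parent in sorted(parents.items())
        if child in trigger_ids and parent in trigger_ids
    ]
    return {
        "jsonrpc": "2.0",
        "method": "trigger.adddependencies",
        "params": params,
        "auth": auth_token,
        "id": request_id,
    }


if __name__ == "__main__":
    # Hypothetical state: the B <-> C port is blocked, so C is reached via A.
    active_links = [("Aggr", "A"), ("Aggr", "B"), ("A", "C")]
    # Hypothetical trigger IDs; real ones would come from trigger.get.
    trigger_ids = {"Aggr": "100", "A": "101", "B": "102", "C": "103"}
    parents = upstream_parents(active_links, "Aggr")
    print(json.dumps(dependency_request(parents, trigger_ids, "AUTH_TOKEN"), indent=2))
```

When the topology changes, the old dependencies have to be removed (the API has a matching `trigger.deletedependencies` method) before the new ones are added, otherwise the stale parent keeps suppressing the trigger.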

    A simpler way could be:

    • set up an ICMP check on your switch IP addresses, with an alert
    • set up an SNMP check for spanning tree recalculation, with an alert (see Template SNMP Switch:stpLastTimeChange)
    • install Netdisco to draw the topology