monitoring, rsync, zabbix

How can I monitor a failed rsync job with Zabbix?


I have a situation where I need to monitor (with Zabbix) whether an rsync job failed to execute.

I thought about writing the exit code to a file at the source and monitoring that, but I haven't found a good way of doing this.

Does anyone have an idea for how I can do this monitoring?


Solution

  • I solved this by doing 3 things.

    1 - Create a script that runs rsync from cron

    #!/bin/bash
    # Put your own rsync command on the line below
    rsync -rlptv --delete-after root@serverA:/some_dir/ /another_dir/ > /lalla_dir/my.log
    
    # Check whether rsync succeeded (exit code 0)
    if [ $? -eq 0 ]; then
        # On success, append a random number and the "Status = OK" message to the log
        echo $(( 1 + RANDOM % 1000 )) >> /lalla_dir/my.log
        echo "Status = OK" >> /lalla_dir/my.log
    else
        # On failure, append a random number and the "Status = ERROR" message to the log
        echo $(( 1 + RANDOM % 1000 )) >> /lalla_dir/my.log
        echo "Status = ERROR" >> /lalla_dir/my.log
    fi
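
    If you want to try the logging part without touching a real server, here is a minimal sketch that wraps the same idea in a function. The name run_and_log and the temporary log file are my own assumptions, not part of the original script, and unlike the script above this version also sends stderr to the log.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): runs any command,
# overwrites the log with its output, then appends a random marker and a
# status line in the same style as the cron script.
run_and_log() {
    local log_file=$1
    shift
    # Overwrite the log with the command's stdout and stderr
    "$@" > "$log_file" 2>&1
    local rc=$?
    # Random marker so the file checksum changes between runs
    echo $(( 1 + RANDOM % 1000 )) >> "$log_file"
    if [ "$rc" -eq 0 ]; then
        echo "Status = OK" >> "$log_file"
    else
        echo "Status = ERROR (exit code $rc)" >> "$log_file"
    fi
    return "$rc"
}

# Stand-in commands for the demo; replace `true` with the real rsync call.
demo_log=$(mktemp)
run_and_log "$demo_log" true || echo "job failed"
tail -n 1 "$demo_log"
run_and_log "$demo_log" false || echo "job failed"
tail -n 1 "$demo_log"
rm -f "$demo_log"
```

    Including the exit code in the ERROR line is optional, but it can help when reading the log later, since rsync uses different exit codes for different kinds of failure.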
    

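    2 - Create two items in Zabbix

    A - Check the checksum of my.log. This is why the script writes a random number: the log file then differs from the previous run even when rsync's own output is identical (barring the rare case of the same number twice in a row), so you can tell it has been modified since the last check.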

    Zabbix key

    vfs.file.cksum[/lalla_dir/my.log]
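
    To see why the random number matters for the checksum item, here is a small sketch using the standard cksum utility on a temporary file. As far as I know the Zabbix item uses the same UNIX cksum algorithm, but treat that as an assumption and compare the values on your own agent. The helper name file_cksum is made up for the example.

```shell
#!/bin/bash
# Print only the checksum field of `cksum` output (checksum, size, name)
file_cksum() {
    cksum "$1" | cut -d ' ' -f 1
}

demo_log=$(mktemp)
echo "Status = OK" > "$demo_log"
before=$(file_cksum "$demo_log")

# Appending a marker line, as the cron script does, changes the checksum
echo $(( 1 + RANDOM % 1000 )) >> "$demo_log"
after=$(file_cksum "$demo_log")

if [ "$before" != "$after" ]; then
    echo "checksum changed: $before -> $after"
else
    echo "checksum unchanged"
fi
rm -f "$demo_log"
```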
    

    B - Check the log file for the OK message.

    Zabbix key

    vfs.file.regmatch[/lalla_dir/my.log,Status = OK]
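
    The regmatch item returns 1 when the expression is found in the file and 0 when it is not. Here is a rough local stand-in using grep, so you can check your expression against a sample log before creating the item. The function name regmatch and the sample log lines are invented for the example; this is not how the agent implements it.

```shell
#!/bin/bash
# Hypothetical local stand-in for vfs.file.regmatch:
# prints 1 if any line of the file matches the regex, otherwise 0.
regmatch() {
    local file=$1 regex=$2
    if grep -q -E -- "$regex" "$file"; then
        echo 1
    else
        echo 0
    fi
}

demo_log=$(mktemp)
# Made-up log contents in the shape the cron script produces
printf '%s\n' "some rsync output" "417" "Status = OK" > "$demo_log"
regmatch "$demo_log" "Status = OK"      # prints 1
printf '%s\n' "some rsync output" "58" "Status = ERROR" > "$demo_log"
regmatch "$demo_log" "Status = OK"      # prints 0
rm -f "$demo_log"
```

    On the monitored host itself you can also ask the agent directly with zabbix_agentd -t and the item key, which shows the value the agent would report.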
    

    3 - Create the trigger.

    {my-server:vfs.file.cksum[/lalla_dir/my.log].change()}=0
    or
    {my-server:vfs.file.regmatch[/lalla_dir/my.log,Status = OK].last()}=0
    

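    So, if the log file did not change or does not contain the "Status = OK" message, it means that rsync ran with an error (failed) or did not run at all (a cron problem, maybe). Note that the checksum check only makes sense if the item is not polled more often than the cron job runs; otherwise the file will look unchanged between two runs.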
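
    The trigger logic can be written out as a tiny function to make the three cases explicit. This is only an illustration of the condition, not something Zabbix runs; the name trigger_state and the checksum values are made up.

```shell
#!/bin/bash
# Hypothetical helper mirroring the trigger: PROBLEM when the checksum did not
# change between two checks, or when the last regmatch value is 0.
trigger_state() {
    local prev_cksum=$1 cur_cksum=$2 regmatch_last=$3
    if [ "$prev_cksum" = "$cur_cksum" ] || [ "$regmatch_last" -eq 0 ]; then
        echo "PROBLEM"
    else
        echo "OK"
    fi
}

trigger_state 1111 2222 1   # log changed and OK message found -> OK
trigger_state 1111 1111 1   # log unchanged (job did not run)  -> PROBLEM
trigger_state 1111 2222 0   # log changed but no OK message    -> PROBLEM
```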

    Sorry for the bad English; the use of has, have, they ... still confuses me.