monitoring zabbix

Please suggest a good monitoring and alerting tool for applications hosted in the cloud


I am looking for a monitoring and alerting tool for my application hosted in the cloud. The application runs across multiple servers, and I want to monitor all of them. I am interested in monitoring the following:

1. Service monitoring:

  • Check whether the service is up. This requires:
    • trying to sign up a new user
    • logging in to the application with a given username/password and performing certain steps, such as a search
  • Monitoring QoS: how long searches and some other operations take
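
Whatever tool is chosen, a check of this kind usually boils down to "run a scripted action, record whether it succeeded and how long it took". The following is a minimal sketch of that idea, not any tool's actual API; `run_check`, `CheckResult`, and `fake_search` are hypothetical names, and `fake_search` merely stands in for a real sign-up, login, or search request.

```python
import time
from typing import Callable, NamedTuple


class CheckResult(NamedTuple):
    name: str
    ok: bool
    seconds: float
    error: str


def run_check(name: str, action: Callable[[], None], max_seconds: float) -> CheckResult:
    """Run one scripted check; it passes only if it succeeds within max_seconds."""
    start = time.monotonic()
    error = ""
    try:
        action()  # e.g. sign up a user, log in, run a search
    except Exception as exc:  # any exception counts as a failed check
        error = f"{type(exc).__name__}: {exc}"
    elapsed = time.monotonic() - start
    ok = not error and elapsed <= max_seconds
    return CheckResult(name, ok, elapsed, error)


def fake_search() -> None:
    """Hypothetical stand-in for a real search request against the application."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(run_check("search", fake_search, max_seconds=1.0))
```

The recorded `seconds` value is what a QoS graph or threshold alert would be built on.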

2. Resource monitoring

Monitor the following parameters on each server:

  • CPU utilization
  • load average
  • Memory usage
  • Disk usage
  • IOPS
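
As a rough illustration of what an agent collects for this, the sketch below samples two of the parameters above (load average and disk usage) using only the Python standard library. CPU utilization, memory usage, and IOPS need OS-specific sources and are left out. The function names and the threshold values in the usage line are assumptions made for the example.

```python
import os
import shutil
from typing import Dict, List, Optional


def sample_resources(path: str = "/") -> Dict[str, Optional[float]]:
    """Collect disk usage for `path` and the 1/5/15-minute load averages."""
    usage = shutil.disk_usage(path)
    try:
        load1, load5, load15 = os.getloadavg()  # not available on every platform
    except (AttributeError, OSError):
        load1 = load5 = load15 = None
    return {
        "disk_used_percent": 100.0 * usage.used / usage.total,
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
    }


def breaches(sample: Dict[str, Optional[float]], limits: Dict[str, float]) -> List[str]:
    """Return the names of metrics whose sampled value exceeds its limit."""
    return [
        name
        for name, limit in limits.items()
        if sample.get(name) is not None and sample[name] > limit
    ]


if __name__ == "__main__":
    current = sample_resources()
    # Example limits only; sensible values depend on the server.
    print(current, breaches(current, {"disk_used_percent": 90.0, "load_1m": 4.0}))
```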

3. Process monitoring

Monitor whether a set of processes is running; if one is not, try restarting it. Examples: php-fpm, my application binaries, mysql, nginx, smtp, etc.
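
The check-and-restart behavior can be sketched as follows. This is only an illustration under two assumptions: that `pgrep` is available on the server, and that each process is managed as a systemd unit with the same name as the process (often not true in practice, e.g. for custom binaries). `ensure_running` is a hypothetical helper, and the check and restart functions are injectable so the logic can be exercised without touching real services.

```python
import subprocess
from typing import Callable, Iterable, List


def is_running(name: str) -> bool:
    """Return True if pgrep finds a process whose name matches exactly."""
    result = subprocess.run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart(name: str) -> None:
    """Assumption: the process is a systemd unit with the same name."""
    subprocess.run(["systemctl", "restart", name], check=False)


def ensure_running(
    names: Iterable[str],
    is_running_fn: Callable[[str], bool] = is_running,
    restart_fn: Callable[[str], None] = restart,
) -> List[str]:
    """Restart every process that is not running; return the names restarted."""
    restarted = []
    for name in names:
        if not is_running_fn(name):
            restart_fn(name)
            restarted.append(name)
    return restarted
```

A scheduler such as cron could call `ensure_running(["nginx", "php-fpm"])` periodically; the returned list is a natural trigger for an alert.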

4. Monitoring log files

  • Error logs of my application
  • MySQL error log
  • MySQL slow query log, etc.
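
Log monitoring generally means reading only what was appended since the last check and matching it against a pattern. Here is a minimal sketch of that technique; `scan_log` is a hypothetical helper, and the `ERROR` pattern in the usage comment is an assumption about the log format.

```python
import os
import re
from typing import List, Tuple


def scan_log(path: str, pattern: str, offset: int = 0) -> Tuple[List[str], int]:
    """Return complete lines appended after `offset` that match `pattern`,
    plus the offset to pass in on the next call."""
    regex = re.compile(pattern)
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() < offset:  # file shrank: it was rotated or truncated
            offset = 0
        handle.seek(offset)
        data = handle.read()
    # Only consume up to the last newline, so a half-written line is
    # re-read in full on the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    matches = [line for line in lines if regex.search(line)]
    return matches, offset + end


# Usage: keep the returned offset between runs, for example
#   matches, offset = scan_log("/path/to/error.log", r"ERROR", offset)
```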

I should also be able to extend the tool by executing shell commands or writing my own shell scripts.

I should be able to set an alert for any monitored item that is found to be problematic, and to receive alerts through:

  • email
  • Mobile SMS
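
For the email channel, an alert is ultimately just a message built from the offending item and its value. The sketch below builds such a message with the standard library. `build_alert` and the addresses are hypothetical, and SMS is not covered because it depends on a provider or gateway.

```python
from email.message import EmailMessage
from typing import Sequence


def build_alert(
    item: str,
    value: float,
    limit: float,
    sender: str,
    recipients: Sequence[str],
) -> EmailMessage:
    """Build an alert email for a monitored item that crossed its limit."""
    msg = EmailMessage()
    msg["Subject"] = f"[ALERT] {item} = {value} (limit {limit})"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Monitored item {item} reported {value}, which exceeds the limit of {limit}."
    )
    return msg


if __name__ == "__main__":
    # Placeholder addresses; printing avoids needing a mail server to try it out.
    print(build_alert("disk_used_percent", 95.0, 90.0,
                      "monitor@example.com", ["ops@example.com"]))
```

Sending the message could be done with `smtplib.SMTP(host).send_message(msg)`, assuming an SMTP server is reachable at `host`.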

The monitoring system should retain history for a period I choose, so that after receiving an alert I can log in to the system, view past data (say, the past 2 weeks), and investigate problems.

Most important:

The tool should have a very good way of managing its own configuration.

  • The configuration should not be scattered across multiple places; all of it should be stored in one centralized place. Suppose the path of a monitored log file changes in the future: I would like to search for and replace all occurrences of that path in my configuration.
  • I should be able to version control my configuration.
  • Instead of going to the web interface and setting up the configuration manually, I would like to set up a script that automatically loads all the configuration and starts monitoring.
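
To make these requirements concrete, here is a sketch of the kind of setup I have in mind: a single text file that can live in version control, with shared values such as a log path defined once, and a script that loads it. The schema, the check names, and the path are all made up for illustration; no particular tool uses this format.

```python
import json
from string import Template
from typing import Any, Dict, List

# Hypothetical config; in practice this would be a file in a git repository.
CONFIG_TEXT = """
{
  "vars": {"app_log": "/var/log/myapp/error.log"},
  "checks": [
    {"name": "app-errors", "type": "log", "path": "${app_log}", "pattern": "ERROR"},
    {"name": "app-log-size", "type": "file_size", "path": "${app_log}", "max_mb": 500}
  ]
}
"""


def load_checks(text: str) -> List[Dict[str, Any]]:
    """Parse the config and expand ${name} references from the "vars" section."""
    raw = json.loads(text)
    variables = raw.get("vars", {})
    checks = []
    for check in raw["checks"]:
        resolved = {
            # safe_substitute leaves unknown or stray "$" untouched, so regex
            # patterns such as "failed$" survive unchanged.
            key: Template(value).safe_substitute(variables) if isinstance(value, str) else value
            for key, value in check.items()
        }
        checks.append(resolved)
    return checks


if __name__ == "__main__":
    for check in load_checks(CONFIG_TEXT):
        print(check)
```

With this layout, changing the log path means editing one line under `vars`, and the change shows up as a one-line diff in version control.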

I am exploring Zabbix but don't see a satisfactory way to manage its configuration. Should I try Nagios? Is there any other tool I should consider?


Solution

  • Nagios is one of the standard monitoring tools and can support all the use cases you brought up (plus, plugins have probably already been written for all of them).