elasticsearch redis rabbitmq logstash

Redis vs. RabbitMQ as a data broker/messaging system between Logstash and Elasticsearch


We are defining an architecture in which Logstash shippers installed on various machines collect log information, the data is indexed centrally in one Elasticsearch server, and Kibana serves as the graphical layer. We need a reliable messaging system between the Logstash shippers and Elasticsearch to guarantee delivery. What factors should be considered when choosing between Redis and RabbitMQ as the data broker/messaging system between the Logstash shippers and Elasticsearch?


Solution

  • After evaluating both Redis and RabbitMQ, I chose RabbitMQ as our broker for the following reasons:

    1. RabbitMQ has a built-in security layer: you can use SSL certificates to encrypt the data you send to the broker, so nobody on the network can sniff it and gain access to vital organizational data.
    2. RabbitMQ is a very stable product that can handle a large number of events per second and many connections without becoming the bottleneck.
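
To make the first point concrete, here is a minimal sketch of a client opening an encrypted connection to the broker. It assumes a Python client using the third-party pika library and the conventional AMQPS port 5671; the helper names `build_broker_tls_context` and `connect_to_rabbitmq` and the certificate paths are hypothetical, and none of this is taken from the setup described in this answer.

```python
import ssl
from typing import Optional


def build_broker_tls_context(ca_file: Optional[str] = None,
                             client_cert: Optional[str] = None,
                             client_key: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context that verifies the broker's certificate.

    All three paths are hypothetical; with ca_file=None the system's
    default CA store is used.
    """
    # create_default_context turns on certificate and hostname verification.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if client_cert:
        # Optional client certificate, for brokers that require mutual TLS.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


def connect_to_rabbitmq(host: str, context: ssl.SSLContext, port: int = 5671):
    """Open a TLS connection with pika (assumed to be installed)."""
    import pika  # imported lazily so the context helper works without pika

    params = pika.ConnectionParameters(
        host=host,
        port=port,
        ssl_options=pika.SSLOptions(context, server_hostname=host),
    )
    return pika.BlockingConnection(params)
```

The broker itself must also be configured with a TLS listener and a server certificate, which is outside the scope of this sketch.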

    Regarding scaling, RabbitMQ has a built-in cluster implementation that you can combine with a load balancer to build a redundant broker environment.

    See also the related question "Is my RabbitMQ cluster Active Active or Active Passive?"

    Now for the weaker points of using RabbitMQ:

    1. Most Logstash shippers do not support RabbitMQ. On the other hand, the best one, Beaver, has an implementation that sends data to RabbitMQ without a problem.
    2. In its current version, Beaver's RabbitMQ implementation is a little slow for my purposes: it could not handle a rate of 3000 events/sec from one server, and the service crashed from time to time.
    3. Right now I am working on a fix that should solve the performance problem with RabbitMQ and make the Beaver shipper more stable. The first solution is to add more processes that run simultaneously, which gives the shipper more power. The second solution is to change Beaver to send data to RabbitMQ asynchronously, which theoretically should be much faster. I hope to finish implementing both solutions by the end of this week.
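
The second solution, asynchronous sending, can be illustrated with a small sketch. This is not Beaver's code or the contents of the pull request; it is a generic pattern, assuming a hypothetical blocking `publish` callable that sends one event to the broker. The reader thread hands events to a bounded queue and a background thread drains it, so a slow broker round trip no longer stalls log reading.

```python
import queue
import threading
from typing import Callable


class AsyncPublisher:
    """Decouple log reading from publishing with a bounded in-memory queue."""

    def __init__(self, publish: Callable[[str], None], max_pending: int = 10000):
        self._publish = publish  # hypothetical blocking send to the broker
        self._pending = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def send(self, event: str) -> None:
        # Returns immediately unless the queue is full; a full queue blocks
        # the caller, which acts as back-pressure instead of unbounded growth.
        self._pending.put(event)

    def _drain(self) -> None:
        # Runs in the background thread and publishes events in FIFO order.
        while True:
            event = self._pending.get()
            if event is None:  # sentinel placed by close()
                break
            self._publish(event)

    def close(self) -> None:
        # Flush everything queued so far, then stop the worker.
        self._pending.put(None)
        self._worker.join()
```

A usage example, with a list standing in for the broker:

```python
sent = []
publisher = AsyncPublisher(sent.append)
publisher.send("line 1")
publisher.send("line 2")
publisher.close()
```

The sketch has no retry or error handling: if `publish` raises, the worker thread stops. Events that are queued but not yet published are lost if the process crashes, so this trades some delivery safety for throughput.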

    You can follow the issue here: https://github.com/josegonzalez/python-beaver/issues/323

    And check the pull request here: https://github.com/josegonzalez/python-beaver/pull/324

    If you have more questions, feel free to leave a comment.