Celery with Redis vs Redis Alone

I am having trouble understanding what the advantage of using Celery is. I realize you can use Celery with Redis, RabbitMQ, etc., but why wouldn't I just use the client for those message queue services directly rather than putting Celery in front of them?

Solution

The advantage of using Celery is that we mainly need to write the task-processing code; delivering tasks to the task processors is taken care of by the Celery framework. Scaling out task processing is also easy: just run more Celery workers, or run them with higher concurrency (more processing threads/processes). We don't even need to write the code that submits tasks to the queues and consumes tasks from them. Celery also has a built-in facility for adding and removing consumers for any of the task queues. The framework supports task retries, failure handling, result collection, and so on. It has many features that let us concentrate on implementing the task-processing logic only.
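To make concrete what "handled by the framework" covers, here is a minimal sketch of the plumbing one ends up writing by hand. It is only an illustration under stated assumptions: it uses Python's standard library, with an in-process `queue.Queue` standing in for a Redis list or RabbitMQ queue, and `run_tasks` with its parameters is a hypothetical helper, not part of Celery or any broker client.

```python
import queue
import threading


def run_tasks(func, payloads, num_workers=4, max_retries=3):
    """Run func over payloads on worker threads, retrying failed tasks.

    Hypothetical helper: the in-process queue stands in for a real broker.
    """
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: shut this worker down
                tasks.task_done()
                return
            task_id, payload, attempt = item
            try:
                value = func(payload)
            except Exception as exc:
                if attempt < max_retries:
                    # Re-queue before task_done() so join() keeps waiting.
                    tasks.put((task_id, payload, attempt + 1))
                else:
                    with lock:
                        results[task_id] = ("FAILURE", repr(exc))
            else:
                with lock:
                    results[task_id] = ("SUCCESS", value)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for task_id, payload in enumerate(payloads):
        tasks.put((task_id, payload, 0))
    tasks.join()  # wait until every task, including retries, has finished
    for _ in threads:
        tasks.put(None)  # one shutdown sentinel per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_tasks(lambda x: x * 2, [1, 2, 3]))
```

Even this toy version needs a queue, a worker loop, retry bookkeeping, result storage, and shutdown handling, and it still runs on a single machine. With Celery, the task function is decorated with `@app.task` and submitted with `.delay(...)`, and the equivalent of everything else above comes from the framework.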

As an analogy, implementing a map-reduce program to run on Hadoop is not a very complex task. If the data is small, we can write a simple Python script that implements the map-reduce logic, and it will outperform a Hadoop map-reduce job processing the same data. But when the data is huge, we have to divide it across machines, run multiple processes on those machines, and coordinate their execution. The complexity lies in running multiple instances of mapper and then reducer tasks across multiple machines, collecting the inputs and distributing them to the mappers, transferring the mappers' outputs to the appropriate reducers, monitoring progress, relaunching failed tasks, detecting job completion, and so on. Because we have Hadoop, we don't need to care much about the underlying complexity of executing a distributed job. In the same way, Celery lets us concentrate mainly on the task execution logic.
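For the small-data case in the analogy, the "simple Python script" could look roughly like the sketch below. It is a single-process word count, chosen here only as an illustrative job; the function names are hypothetical and nothing in it is Hadoop's API.

```python
from collections import defaultdict
from itertools import chain


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce step: collapse the values for one key into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in one process."""
    mapped = chain.from_iterable(mapper(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

The logic itself fits in a few short functions. What is missing is everything the paragraph above lists: splitting the input across machines, moving mapper output to the right reducer, and recovering from failed tasks.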