celery scheduled-tasks

What's the point of having a single celery worker with multiple queues?


This continues from the question "How does a Celery worker consuming from multiple queues decide which to consume from first?"

I've set up a single worker and have it listening to two queues. I understand from the question linked above that the worker consumes messages from those two queues either round-robin or in the order they arrived (depending on the Celery version).

So what's the purpose of this setup? How is it different from a single queue? Is it helpful only for monitoring, or is there an operational benefit I'm missing here?


Solution

  • In most scenarios your worker will be subscribed to a single queue only. However, there are scenarios where the ability to subscribe to multiple queues makes sense.

    Here is one. Imagine you have a Celery cluster of 10 machines. They perform various tasks, and among them is a task that downloads files from a remote file server. However, the owner of the file server has whitelisted the IPs of only two of your 10 machines, so only those two can download files from that particular file server. Typically you would have the Celery workers on these two machines subscribe to an additional queue, called "download" for example, and schedule download tasks by sending them to the "download" queue.

    This is a very common scenario: only a subset of your nodes can do a particular thing, such as access remote servers (file servers, database servers, etc.).

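    One could argue "why not have just the 'download' queue on these two machines?" That may be a waste of resources: whenever there are no download tasks waiting, those two machines would sit idle instead of helping with the rest of the workload.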