I'm running some tasks via django-celery (with RabbitMQ as the backend). The tasks are time-consuming and CPU-intensive.
I have 2 worker EC2 instances (one is a small, the other is a high-CPU medium).
I've set the small instance to run 1 concurrent task and the medium to run 4. This works well for me. But occasionally, in the celery monitor, I see that the small instance is working on a task and 2 or 3 more tasks are in the "RECEIVED" state (assigned to the small instance), while the medium instance is not doing anything. Ideally I'd like the medium instance to have preference over the small one, but at the very least, if the small instance is at its concurrency limit, the task should go to the medium. It seems the small instance is biting off more than it can chew, as in allocating itself tasks which it can't start at the moment.
Is there a way to make a worker accept only the tasks it can start at that moment?
Screenshot: http://dl.dropbox.com/u/361747/task-state.png . The worker whose name starts with domU is the small instance; the one whose name starts with ip is the medium.
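You can use the CELERYD_PREFETCH_MULTIPLIER option to control how many tasks each worker prefetches. The prefetch limit is this multiplier times the worker's concurrency, and the default multiplier is 4, which is why your small instance can reserve several tasks beyond the one it is running. In your case CELERYD_PREFETCH_MULTIPLIER=1 limits each worker to prefetching one task per process, which should help distribute the tasks more evenly.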
http://ask.github.com/celery/configuration.html#celeryd-prefetch-multiplier
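As a rough sketch of what this looks like in practice, assuming the setting lives in your Django settings.py as usual with django-celery: the `prefetch_limit` helper below is hypothetical (it is not part of Celery) and only illustrates the arithmetic of multiplier times concurrency.

```python
# settings.py (Django project using django-celery)

# Reserve at most one message per worker process instead of the default 4.
CELERYD_PREFETCH_MULTIPLIER = 1


def prefetch_limit(concurrency, multiplier):
    """Hypothetical helper, for illustration only: the number of messages
    a worker may prefetch is its concurrency times the multiplier."""
    return concurrency * multiplier


# Small instance (concurrency 1) with the default multiplier of 4:
small_default = prefetch_limit(1, 4)

# Small instance (concurrency 1) with the setting above:
small_tuned = prefetch_limit(1, CELERYD_PREFETCH_MULTIPLIER)

# Medium instance (concurrency 4) with the setting above:
medium_tuned = prefetch_limit(4, CELERYD_PREFETCH_MULTIPLIER)
```

With the default multiplier, the small instance may hold up to 4 prefetched tasks, which is consistent with the 2 or 3 extra "RECEIVED" tasks in the screenshot. Restart the workers after changing the setting so that it takes effect.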