
Data locality via many queues in Celery?


We're trying to design a distributed pipeline that crunches large numbers of data chunks in parallel. We're moving towards adopting Celery, but one of the requirements is that we need to be able to map certain jobs to certain nodes in the cluster, e.g. if only one node has access to a certain data chunk.

The first answer that comes to mind is multiple queues, potentially even one queue per node, for a fairly large (~64) number of nodes. Is this feasible and efficient? Are Celery queues lightweight? Is there a better way?
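For concreteness, the kind of explicit routing we're imagining looks something like the sketch below; the broker URL, task body, and queue names are placeholders, not a working deployment:

```python
from celery import Celery

app = Celery("pipeline", broker="amqp://guest@localhost//")

@app.task
def crunch(chunk_id):
    """Process one data chunk that lives on this node's local storage."""
    ...

# Producer side: pin a task to a specific node by naming that node's
# queue explicitly at call time.
crunch.apply_async(args=["chunk-0042"], queue="node-17")
```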


Solution

  • The best answer I've found to date is here:

    Is Celery appropriate for use with many small, distributed systems?

    It suggests that Celery is indeed a good fit for this use case. Perhaps I'll update again once we've implemented it.
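If we do go this route, the locality logic doesn't have to live at every call site: Celery also accepts a router function via `task_routes`, which can pick a queue per task based on its arguments. A minimal sketch, where the chunk-to-node map is a hypothetical stand-in for whatever metadata service knows which node holds each chunk:

```python
from celery import Celery

app = Celery("pipeline", broker="amqp://guest@localhost//")

# Hypothetical placement map; in practice this would query the
# metadata service that tracks where each chunk is stored.
CHUNK_LOCATION = {"chunk-0042": "node-17"}

def route_by_locality(name, args, kwargs, options, task=None, **kw):
    """Send crunch tasks to the queue of the node holding their chunk."""
    if name == "tasks.crunch":
        return {"queue": CHUNK_LOCATION.get(args[0], "node-00")}
    return None  # anything else falls through to the default queue

app.conf.task_routes = (route_by_locality,)
```

On the worker side, each node would then run a worker subscribed only to its own queue, e.g. `celery -A tasks worker -Q node-17`, so it only ever consumes tasks routed to it. As for queue weight: AMQP queues on a broker like RabbitMQ are cheap, so ~64 of them should be well within comfortable limits.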