
beanstalkd: How workers can watch/prioritize tubes/queues in certain order?


The closest thing I have found is this question and answer, but it does not solve my problem: Do priorities work across queues/tubes in Beanstalkd?

I'm thinking that if I could tell my workers to watch tubes in a certain order, or give priority to certain tubes, it would solve my problem, but maybe someone else has a better solution.

What I need, or would like to have, is one tube per user for users 1-5. Example tube names: user1, user2, user3, user4, user5

I have 5 worker processes to process jobs, one watching each tube.

The challenge is that if only 1 user has jobs submitted, I want the other 4 processes to help out and complete the jobs faster.

But if all 5 users have jobs in their tubes, I want to prioritize so that each user's jobs are being processed and making progress toward completion.

What I do not want is user1 hogging all the processing power, especially when his tube has 1000s of jobs in it while user2, user3, user4 and user5 only have 5-10 jobs each that could be finished very quickly.

If I tell all the workers to watch all tubes (user1-5), the FIFO rule kicks in: the user who submitted 10000 jobs first gets priority for all of them, while the users who submitted just a few jobs later wait for the big submission to finish.

What I would also like is that when a process does not see any jobs waiting on its main/prioritized tube (user3 for example), it falls back to helping with any other userX tubes that still have jobs pending.

I'm not sure of the best way to solve this yet but my partial solution here is also an attempt to explain the issue in the first place.

A single user submission could contain 1000 actionable items/jobs. So I am thinking of breaking these 1000 jobs up into batches of 10 or 100 within each user tube. I'm not sure that would help either.

Another potential solution would be to dynamically tell the workers to "watch" or "ignore" tubes, but that adds another level of complexity that I’m not sure is necessary, and hope is not.

Any suggestions on how to solve this would be greatly appreciated.


Solution

    1. If you control the submit method, submit in batches of 10 or 100, giving each successive batch a lower priority (in beanstalkd, a larger priority number; 0 is the most urgent). This is the easiest option here and quickly solves your problem.

    2. Having an orchestrator that makes workers watch/ignore tubes based on some input is also doable. If you design it to be pluggable, you can reuse it with different algorithms/logic. For example, users on the free plan could be picked up later than users on a paid plan. This is not as complicated as you imagine. You just need a script that, based on a set of input parameters, decides the flow by adding workers to or removing them from your tubes. It could come in handy in lots of situations. If you are going to work a lot with message queues, this is the way to go and worth the time.

    3. Another option to explore is to have your workers cross-watch multiple tubes. For example, worker1-instance1 watches tubes user1, user2, user3; worker1-instance2 watches user2, user3, user4; worker1-instance3 watches user3, user4, user5; and so on. This could also be built dynamically in the orchestrator I mentioned.

    4. Break the task into multiple smaller ones. I'm not sure how long your current jobs take, but processing 1000 messages should not cause a performance impact on other users. Make sure tasks take no more than 50ms. With that in place, by the time the 2nd user submits some jobs, many of the first user's jobs are already processed, and you will not notice being capped by the FIFO rule. In our projects we usually break 1 task down into 10 different hops, each with its own tube/task. We have seen that this improves overall throughput and doesn't affect the full round trip.
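
To make option 1 concrete, here is a minimal sketch of the priority-per-batch idea. It does not talk to a beanstalkd server. Instead it models the ordering described in the question for workers that watch all tubes: the smallest priority number is reserved first, and ties go to the oldest job. The names `batch_priority` and `simulate_reserve_order`, the base priority of 1024 and the batch size of 10 are all assumptions made for illustration.

```python
import heapq
from itertools import count

BASE_PRIORITY = 1024  # assumed starting priority; 0 is the most urgent in beanstalkd
BATCH_SIZE = 10       # assumed batch size; the answer suggests 10 or 100


def batch_priority(position, batch_size=BATCH_SIZE, base=BASE_PRIORITY):
    """Priority for the job at 0-based `position` within one user's submission.

    Each successive batch gets a larger number, i.e. a lower priority.
    """
    return base + position // batch_size


def simulate_reserve_order(submissions):
    """Model the order in which workers watching every tube would get jobs.

    `submissions` is a list of (tube, job_count) pairs in submission order.
    This is a simplified model of the queue, not a real beanstalkd client.
    """
    job_ids = count(1)  # increasing ids stand in for submission order
    heap = []
    for tube, job_count in submissions:
        for position in range(job_count):
            heapq.heappush(heap, (batch_priority(position), next(job_ids), tube))

    order = []
    while heap:
        _priority, _job_id, tube = heapq.heappop(heap)
        order.append(tube)
    return order


# user1 submits 1000 jobs first, then user2 submits 5.
order = simulate_reserve_order([("user1", 1000), ("user2", 5)])
print(order[:16])
```

In this model user2's five jobs are handed out right after user1's first batch of ten instead of after all 1000. With a real client, the value returned by `batch_priority` would be passed as the priority when putting each job into the user's tube, with every worker watching all user tubes.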