In my Sidekiq Pro setup I have ~10 queues with different weights. My intention is to use the weights to indicate priorities of the queues.
I had an incident the other day that caused the spawning of a huge amount of low weight jobs from a batch, and there seemed to be a contention effect on the high weight jobs as well.
In this section of the documentation I read the following:
Each queue can be configured with an optional weight. A queue with a weight of 2 will be checked twice as often as a queue with a weight of 1
This has me slightly confused. Can I expect weights to produce priority queue semantics?
While I agree with most of Erik's answer I don't think the conclusion quoted below is true:
when considering large numbers of queued jobs (due to the stochastic nature of the queue selection) priority queue semantics are achieved using weights
I'm not sure whether we disagree about the way Sidekiq works or about what "priority queue semantics" would mean in this context. I'll use Wikipedia's definition of priority queue:
Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority.
My interpretation of this is that the priority would be the same as the weight, and the bigger the weight is, the higher the priority. This should mean that as long as there are jobs in the queue with the highest weight, they should be served before jobs on a queue with a lower weight. This is not the case in Sidekiq when weights are used. It is the case for queues defined in an order without weights, then the order of the queue is simply the priority.
Can I expect weights to produce priority queue semantics?
To convince yourself that weighted queues does not lead to priority semantics, just consider the following counter example. There are two queues, both with a backlog. Queue A has weight 9 and queue B has weight 1. When Sidekiq fetches the next job from weighted queues - a queue is picked randomly with the probability weight/total weight
, so there's a 1/10 chance each time a new job is picked that a job from queue B is picked.
This is not priority queue semantics, since jobs with much lower priority from queue B still run ahead of jobs on queue A 10% of the time.
Not only that, but thinking that it's "90% priority queue semantics" based on the previous example is also probably a mistake. For example, it does not mean that 90% of your available resources will be allocated to queue A from the previous example. Yes, 90% of the throughput will be jobs from queue A, but if jobs on queue B take 10 times longer than jobs on queue A - over half (~53%) of your Sidekiq threads will actually be processing lower priority jobs from queue B.
Thinking about weighted queues in terms of throughput allocation seems to be the most correct way to get an intuition for this. This means: