Search code examples
sidekiqpriority-queue

Sidekiq: Does assigning weights lead to priority queue semantics?


In my Sidekiq Pro setup I have ~10 queues with different weights. My intention is to use the weights to indicate priorities of the queues.

I had an incident the other day that caused the spawning of a huge amount of low weight jobs from a batch, and there seemed to be a contention effect on the high weight jobs as well.

In this section of the documentation I read the following:

Each queue can be configured with an optional weight. A queue with a weight of 2 will be checked twice as often as a queue with a weight of 1

This has me slightly confused. Can I expect weights to produce priority queue semantics?


Solution

  • While I agree with most of Erik's answer I don't think the conclusion quoted below is true:

    when considering large numbers of queued jobs (due to the stochastic nature of the queue selection) priority queue semantics are achieved using weights

    I'm not sure whether we disagree about the way Sidekiq works or about what "priority queue semantics" would mean in this context. I'll use Wikipedia's definition of priority queue:

    Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority.

    My interpretation of this is that the priority would be the same as the weight, and the bigger the weight is, the higher the priority. This should mean that as long as there are jobs in the queue with the highest weight, they should be served before jobs on a queue with a lower weight. This is not the case in Sidekiq when weights are used. It is the case for queues defined in an order without weights, then the order of the queue is simply the priority.

    Can I expect weights to produce priority queue semantics?

    To convince yourself that weighted queues does not lead to priority semantics, just consider the following counter example. There are two queues, both with a backlog. Queue A has weight 9 and queue B has weight 1. When Sidekiq fetches the next job from weighted queues - a queue is picked randomly with the probability weight/total weight, so there's a 1/10 chance each time a new job is picked that a job from queue B is picked.

    This is not priority queue semantics, since jobs with much lower priority from queue B still run ahead of jobs on queue A 10% of the time.

    Not only that, but thinking that it's "90% priority queue semantics" based on the previous example is also probably a mistake. For example, it does not mean that 90% of your available resources will be allocated to queue A from the previous example. Yes, 90% of the throughput will be jobs from queue A, but if jobs on queue B take 10 times longer than jobs on queue A - over half (~53%) of your Sidekiq threads will actually be processing lower priority jobs from queue B.

    Thinking about weighted queues in terms of throughput allocation seems to be the most correct way to get an intuition for this. This means:

    • Weights correspond to the fraction of throughput (when no queues are empty) allocated to each queue.
    • It does not mean that you have allocated a proportion of CPU time or threads to a certain queue.
    • Big batches of low priority jobs can, if they have higher latency than high priority jobs, decrease the overall throughput and affect other queues a disproportionate amount.
    • Assigning a high weight to a latency-sensitive queue will not be enough to guarantee a high throughput for the queue if jobs on other queues are slow.
    • If you want priority semantics, use unweighted queues.
    • If you want priority semantics and guarantee that jobs on all queues are being processed you need to do some serious capacity planning. For incidents, you might also need to make sure that jobs are throttled when there are unexpected surges in enqueued jobs, or that there are circuit breakers if jobs start timing out.