Tags: node.js, redis, node-redis, bull, bull-queue

Bull Queue Performance and Scalability: Queue.add(), Queue.getJob(jobId), Job.remove()


My use case is to create dynamic delayed jobs. (I am using Bull, which supports delayed jobs.)

Based on some event, I want to add more delay to a job's delay interval (i.e., postpone the job further).

Since I could not find any function to update the delay interval of an existing job, I came up with the following steps:

  async function onEvent(jobId) {
    // queue is of type Bull.Queue; job is of type Bull.Job
    const job = await queue.getJob(jobId);
    const data = job.data;
    const delay = job.toJSON().delay;
    await job.remove();
    // Bull's option is `delay` (not `delayed`); reuse the same jobId
    await queue.add("jobName", data, { jobId: jobId, delay: delay + someValue });
  }
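The bookkeeping of these steps can be sketched without a running Redis instance. Below, a plain `Map` stands in for Bull's job store (this models only the get/remove/re-add flow, not Bull's actual Redis structures; `extendDelay` is an illustrative name, not a Bull API):

```javascript
// In-memory stand-in for the job store: jobId -> { data, delay }.
const store = new Map();

function add(jobId, data, delay) {
  store.set(jobId, { data, delay });
}

function extendDelay(jobId, extraDelay) {
  const job = store.get(jobId);          // stands in for queue.getJob(jobId)
  const { data, delay } = job;
  store.delete(jobId);                   // stands in for job.remove()
  add(jobId, data, delay + extraDelay);  // stands in for queue.add(..., { jobId, delay })
}

add("job-1", { value: 1 }, 5000);
extendDelay("job-1", 2000);
// "job-1" is now stored with a delay of 7000
```

Because the job is re-added under the same jobId, duplicate-detection behavior tied to jobId is preserved.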

This pretty much solves my problem.

But I am worried about the scale at which these operations will happen.

I am expecting nearly 50K events per minute, possibly more in the near future.

My Queue size is expected to grow based on unique JobId.

I am expecting more than:

  • 1 million entries daily
  • around 4-5 million entries weekly
  • 10-12 million entries monthly.

Also, after 60-70 days the delay interval for the oldest jobs will elapse, and those jobs will be removed one by one.

I can run multiple processors to handle these delayed jobs, which is not an issue.

My queue size will stabilize after 60-70 days, at which point the queue will hold roughly 10 million jobs.

I can vertically scale my Redis as required.

But I want to understand the time complexity of the queries below:

 queue.getJob(jobId)  // Get Job By Id
 job.remove() // remove job from queue
 queue.add(name, data, opts) // add a delayed job to this queue

If any of these operations is O(N), or if the queue can only hold some maximum number of jobs that is less than 10 million, then I might have to discard this design and come up with something entirely different.

I need advice from experienced folks who can guide me on how to solve this problem.

Any kind of help is appreciated.


Solution

  • Taking reference from the source code:

    queue.getJob(jobId)

    This should be O(1), since it is mostly a hash-based lookup using HMGET. You are only requesting one job, and according to the official Redis docs HMGET is O(N) where N is the number of fields requested; since Bull stores only a small, fixed number of fields at each job's hash key, this works out to O(1) per job.
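To see why a fixed-field HMGET behaves as O(1), here is a toy model in which a `Map` of plain objects stands in for Redis hash keys (the key naming below is illustrative, not Bull's exact scheme):

```javascript
// Toy model: each "Redis key" maps to a hash (a plain object of fields).
const redis = new Map();

// Simulated HMGET: cost scales with fields.length, not with the number
// of stored jobs, so fetching a fixed field set per job is O(1).
function hmget(key, fields) {
  const hash = redis.get(key) || {};
  return fields.map((f) => (f in hash ? hash[f] : null));
}

// A job stored the way one might picture Bull's per-job hash.
redis.set("bull:myQueue:42", { data: '{"value":1}', delay: "5000" });
const [data, delay] = hmget("bull:myQueue:42", ["data", "delay"]);
```

However many jobs the store holds, this lookup touches only one key and a constant number of fields.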

    job.remove()

    Considering that a considerable number of your jobs will be delayed, with only a fraction moved to the waiting or active queue at any given time: removal should be O(log N) on an amortized level, as it mostly uses ZREM on the sorted set (ZREM is O(M * log N) for M removed members; here M = 1).

    queue.add(name, data, opts)

    For adding a job to the delayed queue, Bull uses ZADD, so this is again O(log N).
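The delayed set can be pictured as a score-ordered structure, where the score is the timestamp at which the job becomes due. Redis actually implements sorted sets with a skiplist plus a hash; the sorted-array sketch below only illustrates the O(log N) position search that ZADD relies on:

```javascript
// Illustrative model of a sorted set: members kept ordered by score.
const zset = []; // [{ member, score }] kept sorted ascending by score

// Binary search for the insertion position: O(log N) comparisons.
function bisect(score) {
  let lo = 0, hi = zset.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (zset[mid].score < score) lo = mid + 1; else hi = mid;
  }
  return lo;
}

function zadd(member, score) {
  zset.splice(bisect(score), 0, { member, score });
}

function zrem(member) {
  const i = zset.findIndex((e) => e.member === member);
  if (i !== -1) zset.splice(i, 1);
}

zadd("job-1", 1700000000000); // score = when the job becomes due
zadd("job-2", 1700000005000);
zrem("job-1");
```

One caveat of this array model: `findIndex` and `splice` are O(N) here, whereas Redis pairs the skiplist with an internal hash so it can locate a member in O(1) and unlink it in O(log N), which is why ZREM of a single member is O(log N) rather than O(N).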