
How to Use Heroku Background Workers with NestJS and Bull?


What is the recommended way of providing Heroku workers for heavy processes that I want running on my queue using NestJS?

I have an HTTP server running on Heroku that executes certain time-consuming tasks (e.g. communicating with certain third-party APIs) that I want to be put in a Queue and have delegated to background workers.

Reading this example, it seems that I would create a processor file, instantiate the Queue object there, and then define its process function. That seems to allow scaling up, because each process would have the Queue object and define its processing therein. Spinning up more dynos would provide more workers.

Looking over here, I see that I can declare the process file when I register the queue. There I do not need to instantiate the Queue object and define its process; I can simply export a default function. Can I declare a worker process in my Procfile that points to one of these process files and scale them up? Will that work? Or am I missing something here?

Right now, I don't have separate processes set up. I defined the Processors and the Processes using the provided decorators within Nest's IoC container. I expected things to queue up nicely, but jobs come in fast and my server can't keep up with all the requests and jobs.


Solution

  • Answering my own question a few months later for any future readers.

    The key to solving this was understanding the general pattern of queues. A Queue is made up of three parts:

    1. The actual Queue resting on a Data store somewhere (local memory, Redis, etc.)

    2. The Producer (adding jobs to the Queue)

    3. The Consumer (processing jobs that are on the Queue)

    All 3 parts can exist in the same script, for example by saving jobs to a Queue data structure in local memory and processing them one by one.
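    To make the three parts concrete, here is a minimal sketch of all of them living in one script, using plain TypeScript and an in-memory array. The names (produce, consume, Job) are illustrative only and are not Bull's API:

```typescript
// Minimal illustration of the queue pattern: the Queue itself (an
// in-memory array), a Producer that adds jobs, and a Consumer that
// drains them one by one in FIFO order.
type Job = { id: number; payload: string };

const queue: Job[] = []; // 1. the Queue, resting in local memory

function produce(job: Job) { // 2. the Producer
  queue.push(job);
}

function consume(handler: (job: Job) => void) { // 3. the Consumer
  while (queue.length > 0) {
    handler(queue.shift()!); // FIFO: take the oldest job first
  }
}

produce({ id: 1, payload: 'email-welcome' });
produce({ id: 2, payload: 'sync-crm' });

const processed: number[] = [];
consume((job) => processed.push(job.id));
// processed is now [1, 2] and the queue is empty
```

    Everything that follows is about pulling these three parts apart: moving the Queue to Redis, and the Consumer to its own process.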

    Usually, we want our jobs to outlive the vagaries of local process memory, so we use an external data store like Redis to hold our Queue.

    Still, the Producer (Queue.add(job)) and the Consumer (Queue.process(function)) can exist in the same script. That is what the NestJS documentation demonstrates: the Producer and Consumer are both executed by the same process, namely the NestJS application started in main.ts. However, the two can steal resources from each other, and in a web setting it's very important for the server to be free at all times to respond to incoming requests. Not only that, the jobs might get shoved out of the way by the influx of requests.

    So instead, we can move the Queue.process(function) code to a different script and run it as a separate process. Essentially, that is what Heroku background workers are: separate processes, so the web server and the job processing do not interfere with each other.

    After understanding that, the rest was simple.

    Adding to the Queue

    Adding to the Queue is done in the NestJS application, upon an incoming request.

    import { Injectable } from '@nestjs/common';
    import { InjectQueue } from '@nestjs/bull';
    import { Queue } from 'bull';

    @Injectable()
    export class SomeService {
      constructor(@InjectQueue('my_queue') private readonly myQueue: Queue) {}

      // JobData is your own payload type
      async addToQueue(jobData: JobData) {
        await this.myQueue.add(jobData);

        return 'Added to Queue';
      }
    }
    

    Processing Jobs in the Queue

    In a separate script, say my-queue.process.ts:

    import Queue from 'bull';

    // REDIS_URL points at the same Redis instance the web process uses
    const myQueue = new Queue('my_queue', process.env.REDIS_URL);

    myQueue.process(async (job: Queue.Job) => {
      // job.data is the object the Producer passed to myQueue.add()
      await somePromise(job.data.id);
      return 'success';
    });
    

    The only caveat here is that services defined in your NestJS application won't be available via dependency injection in this standalone script, so we need to instantiate them ourselves and provide their dependencies manually. For example:

    const myService = new SomeService();
    // myOtherService depends on myService. In Nest, it was automatically injected in the constructor. 
    const myOtherService = new SomeOtherService(myService);
    

    This can get messy. My script has a whole lot of that before getting to the Consumer. But it works, and I haven't found a clean way to inject all the Module Dependencies with a one-liner yet.
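    As an aside, one approach worth exploring to avoid that manual wiring is bootstrapping Nest's standalone application context inside the worker script, which reuses the same IoC container without starting an HTTP server. This is a hedged sketch, not something I have verified against this exact setup; AppModule and SomeService stand in for your own module and service:

```typescript
// Sketch: let Nest's DI container build the services inside the worker
// process. AppModule and SomeService are placeholders for your own code.
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
import { SomeService } from './some.service';

async function bootstrapWorker() {
  // createApplicationContext starts the IoC container only (no HTTP server)
  const app = await NestFactory.createApplicationContext(AppModule);
  const someService = app.get(SomeService);
  // ...hand someService to the queue's process callback here
}

bootstrapWorker();
```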

    That's it for our worker.

    In our Procfile we would have a line that looks something like this:

    worker: node dist/path/to/my-queue.process.js

    (Note that node runs the compiled JavaScript output, not the .ts source.)

    That's the building block for understanding how to orchestrate this flow.

    Check out the following article on Heroku for a deeper understanding of what you can do with Heroku background workers: Heroku Article

    And even more insight can be derived from their example code: GitHub repo (check out worker.js)