Tags: javascript, node.js, streaming, node-streams

How to limit flow between streams in NodeJS


I have a readStream piped to a writeStream. The read stream reads from the internet and the write stream writes to my local database instance. I noticed that the read speed is much faster than the write speed, and my app's memory usage rises until it reaches

JavaScript heap out of memory

I suspect that the read data accumulates inside the Node.js app.

How can I limit the read stream so it reads only what the write stream is capable of writing at a given time?


Solution

  • Ok, so long story short: the mechanism you need to be aware of to solve this kind of issue is backpressure. It is not a problem when you use Node's standard pipe(); it happened in my case because I am using a custom fan-out to multiple streams.

    You can read about it here https://nodejs.org/en/docs/guides/backpressuring-in-streams/

    This solution is not ideal, as it blocks the read stream whenever any one of the fan-out write streams is blocked, but it gives a general idea of how to approach the problem:

    // Fan-out: route each transformed object to the write stream for its table.
    combinedStream.pipe(transformer).on('data', async (data: DbObject) => {
      const writeStream = dbClient.getStreamForTable(data.table);
      // write() returns false when the stream's internal buffer is full...
      if (!writeStream.write(data.csv)) {
        // ...so pause the source and wait for 'drain' before resuming it.
        combinedStream.pause();
        await new Promise((resolve) => writeStream.once('drain', resolve));
        combinedStream.resume();
      }
    });