Tags: node.js, node-streams

Get hash of ReadStream and output data of stream


I have a ReadStream, created with fs.createReadStream, that I want to read multiple times.

The first time, I'm using it to get its MD5 hash with the hasha module's fromStream function; the second time, I'm using it with FormData to upload the file to web hosting.

How can I use this one ReadStream to do both of these things?

const readStream = fs.createReadStream("/tmp/test.txt");
const hash = await hasha.fromStream(readStream, hashOptions);
readStream
  .on("data", (chunk) => console.log("data chunk", chunk))
  .on("end", () => console.log("finished"));

It's not logging the content to the console as it should, probably because hasha.fromStream is piping the stream. If I don't call hasha.fromStream, it works fine and the chunks are logged.

The module I'm using, hasha, is on GitHub: https://github.com/sindresorhus/hasha/blob/master/index.js#L45

I don't want to buffer the data before hashing, because I'll be using this with large files.

I have also made a RunKit script demonstrating my problem; you can play with it there: https://runkit.com/5942fba4653ae70012196b77/5942fba4653ae70012196b78


Solution

  • Here's a standalone example of how to "fork" a stream so you can pipe it to two destinations:

    const { PassThrough } = require('stream');
    const hasha = require('hasha');
    
    async function hashAndPost(stream) {
      const pass1 = new PassThrough();
      const pass2 = new PassThrough();
    
      // Pipe the source into both PassThrough streams so each
      // destination receives its own copy of the data.
      stream.pipe(pass1);
      stream.pipe(pass2);
    
      // Destination #1: log the data chunks
      pass1.on('data', chunk =>
        console.log('data chunk', chunk.toString())
      ).on('end', () =>
        console.log('finished')
      );
    
      // Destination #2: compute the hash
      const hash = await hasha.fromStream(pass2, { algorithm: 'md5' });
      console.log('hash', hash);
      return hash;
    }