
Node: How can I use pipe and change one file in a multipart request?


I have an HTTP service that needs to forward a request. I am not buffering the request body, because I deal with big files in multipart and buffering them overwhelms RAM or disk (see How do Node.js Streams work?).

Now I am using pipes and it works. The code looks something like this:

var Req = getReq(response);
request.pipe(Req);

The only shortcoming is that the multipart body I resend through the pipe contains one JSON file in which a few fields need to be changed.

Can I still use a pipe and change one file in the piped multipart?


Solution

  • You can do this using a Transform Stream.

    var Req = getReq(response);
    var transformStream = new TransformStream();
    
    // the boundary key for the multipart is in the headers['content-type']
    // if this isn't set, the multipart request would be invalid
    Req.headers['content-type'] = request.headers['content-type'];
    
    // pipe from request to our transform stream, and then to Req
    // it will pipe chunks, so it won't use too much RAM
    // however, you will have to keep the JSON you want to modify in memory
    request.pipe(transformStream).pipe(Req);
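
    The boundary mentioned in the comments above has to be known before the transform can search for it. A minimal sketch of reading it from the header, assuming the usual `multipart/...; boundary=...` form (`getBoundary` and `getDelimiter` are hypothetical helper names, not part of Node's API):

```javascript
// Hypothetical helper: returns the multipart boundary from a content-type
// header value, or null if the header carries no boundary parameter.
function getBoundary(contentType) {
  const match = /boundary=(?:"([^"]+)"|([^;\s]+))/i.exec(contentType || '');
  return match ? (match[1] || match[2]) : null;
}

// In the body, each part is introduced by "--" followed by the boundary.
function getDelimiter(contentType) {
  const boundary = getBoundary(contentType);
  return boundary === null ? null : Buffer.from('--' + boundary);
}
```

    With the request from the snippet above this would be called as `getDelimiter(request.headers['content-type'])`.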
    

    Transform Stream code:

    var Transform = require('stream').Transform,
        util = require('util');
    
    var TransformStream = function() {
      // the chunks are raw Buffers, so the default (non-object) mode is enough
      Transform.call(this);
    };
    util.inherits(TransformStream, Transform);
    
    TransformStream.prototype._transform = function(chunk, encoding, callback) {
      // here should be the "modify" logic

      // this will push all chunks as they come, leaving the multipart unchanged
      // there's no limitation on what you can push
      // you can push nothing, or you can push an entire file
      this.push(chunk);
      callback();
    };

    TransformStream.prototype._flush = function(callback) {
      // you can push in _flush
      // this.push( SOMETHING );
      callback();
    };
    

    In the _transform function, your logic should be something like this:

    1. If, in the current chunk, the JSON you want to modify begins

      <SOME_DATA_BEFORE_JSON> <MY_JSON_START>

      then this.push(SOME_DATA_BEFORE_JSON); and keep MY_JSON_START in a local var

    2. While your JSON hasn't ended, append the chunk to your local var

    3. If, in the current chunk, the JSON ends:

      <JSON_END> <SOME_DATA_AFTER_JSON>

      then add JSON_END to your var, do whatever changes you want, and push the changes: this.push(local_var); this.push(SOME_DATA_AFTER_JSON);

    4. If current chunk has nothing of your JSON, simply push the chunk

      this.push(chunk);
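
    The four steps above could be sketched roughly as follows. This is only an illustration under assumptions: `JsonPartRewriter`, the marker strings, and the `modify` callback are hypothetical names, the JSON part is assumed to start right after a known header block and to end at the next boundary, and the JSON is assumed small enough to hold in memory.

```javascript
const { Transform } = require('stream');

// Hypothetical class (not from the original answer): rewrites the single part
// whose body lies between `startMarker` and `endMarker`.
class JsonPartRewriter extends Transform {
  constructor(startMarker, endMarker, modify) {
    super();
    this.startMarker = Buffer.from(startMarker);
    this.endMarker = Buffer.from(endMarker);
    this.modify = modify;           // function (object) -> object
    this.pending = Buffer.alloc(0); // bytes received but not yet pushed
    this.state = 'before';          // 'before' | 'inside' | 'after'
  }

  _transform(chunk, encoding, callback) {
    if (this.state === 'after') {
      // Step 4: the JSON is already handled, pass the chunk through.
      this.push(chunk);
      return callback();
    }
    this.pending = Buffer.concat([this.pending, chunk]);

    if (this.state === 'before') {
      const start = this.pending.indexOf(this.startMarker);
      if (start === -1) {
        // Step 4: no JSON yet. Hold back a short tail in case the start
        // marker is split across two chunks, and push the rest.
        const keep = Math.min(this.pending.length, this.startMarker.length - 1);
        const cut = this.pending.length - keep;
        if (cut > 0) this.push(this.pending.subarray(0, cut));
        this.pending = this.pending.subarray(cut);
        return callback();
      }
      // Step 1: push everything up to and including the start marker.
      const bodyStart = start + this.startMarker.length;
      this.push(this.pending.subarray(0, bodyStart));
      this.pending = this.pending.subarray(bodyStart);
      this.state = 'inside';
    }

    // Step 2: keep buffering the JSON until the end marker shows up.
    const end = this.pending.indexOf(this.endMarker);
    if (end === -1) return callback();

    // Step 3: modify the JSON, push it, then push the data after it.
    try {
      const json = JSON.parse(this.pending.subarray(0, end).toString('utf8'));
      this.push(JSON.stringify(this.modify(json)));
    } catch (err) {
      return callback(err);
    }
    this.push(this.pending.subarray(end));
    this.pending = Buffer.alloc(0);
    this.state = 'after';
    callback();
  }

  _flush(callback) {
    // Push whatever is left unmodified (for example a held-back tail).
    if (this.pending.length > 0) this.push(this.pending);
    callback();
  }
}
```

    It would then be used as `request.pipe(new JsonPartRewriter(start, end, fn)).pipe(Req);`. Keep in mind that if the rewritten JSON has a different length, a Content-Length header copied from the original request no longer matches the body.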

    Other than that, you may want to read up on the multipart format. SOME_DATA_BEFORE_JSON from above will look like this:

    --frontier
    Content-Type: text/plain
    
    <JSON_START>
    

    Besides Content-Type, the part headers may contain the filename, encoding, etc. Keep in mind that the chunks may end anywhere (even in the middle of the frontier). The parsing can get quite tricky; I would search for the boundary key (frontier), and then check whether the JSON starts after it. There are two cases:

    1. chunk: <SOME_DATA> --frontier <FILE METADATA> <FILE_DATA>
    2. chunk 1: <SOME_DATA> --fron
       chunk 2: tier <FILE METADATA> <FILE_DATA>
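
    Both cases can be covered by holding back the last few bytes of each chunk and searching them together with the next one. A minimal sketch, where `createMarkerScanner` is a hypothetical helper name and the marker is assumed to be `--` plus the boundary:

```javascript
// Hypothetical helper: reports the absolute byte offset of every occurrence
// of `marker` in a sequence of chunks, including occurrences that are split
// across two (or more) chunks.
function createMarkerScanner(marker) {
  const needle = Buffer.from(marker);
  let tail = Buffer.alloc(0); // held-back bytes from earlier chunks
  let consumed = 0;           // stream offset of the first byte in `tail`

  return function scan(chunk) {
    const data = Buffer.concat([tail, chunk]);
    const found = [];
    let from = 0;
    let index;
    while ((index = data.indexOf(needle, from)) !== -1) {
      found.push(consumed + index);
      from = index + needle.length;
    }
    // Hold back at most needle.length - 1 bytes: enough to complete a split
    // marker later, too few to report the same marker twice.
    const keepFrom = Math.max(from, data.length - (needle.length - 1));
    tail = data.subarray(keepFrom);
    consumed += keepFrom;
    return found;
  };
}
```

    For case 2, scanning `ab--fron` reports nothing yet, and scanning `tier ...` afterwards reports the boundary at the offset where `--fron` began.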

    Hope this helps!