Search code examples
node.jsredisqueuejobskue

How to make the client download a very large file that is genereted on the fly


I have an export function that read the entire database and create a .xls file with all the records. Then the file is sent to the client.

Of course, the time of export the full database requires a lot of time and the request will soon end in a timeout error.

What is the best solution to handle this case?

I heard something about making a queue with Redis for example but this will require two requests: one for starting the job that will generate the file and the second to download the generated file.

Is this possible with a single request from the client?


Solution

  • Excel Export:

    Use Streams. Following is a rough idea of what might be done:

    1. Use exceljs module. Because it has a streaming API aimed towards this exact problem.

      var Excel = require('exceljs')
      
    2. Since we are trying to initiate a download. Write appropriate headers to response.

      res.status(200);
      res.setHeader('Content-disposition', 'attachment; filename=db_dump.xls');
      res.setHeader('Content-type', 'application/vnd.ms-excel');
      
    3. Create a workbook backed by Streaming Excel writer. The stream given to writer is server response.

      var options = {
          stream: res, // write to server response
          useStyles: false,
          useSharedStrings: false
      };
      
      var workbook = new Excel.stream.xlsx.WorkbookWriter(options);
      
    4. Now, the output streaming flow is all set up. for the input streaming, prefer a DB driver that gives query results/cursor as a stream.

    5. Define an async function that dumps 1 table to 1 worksheet.

      var tableToSheet = function (name, done) {
          var str = dbDriver.query('SELECT * FROM ' + name).stream();
          var sheet = workbook.addWorksheet(name);
      
          str.on('data', function (d) {
              sheet.addRow(d).commit(); // format object if required
          });
      
          str.on('end', function () {
              sheet.commit();
              done();
          });
      
          str.on('error', function (err) {
              done(err);
          });
      }
      
    6. Now, lets export some db tables, using async module's mapSeries:

      async.mapSeries(['cars','planes','trucks'],tableToSheet,function(err){
         if(err){
           // log error
         }
         res.end();
      })
      

    CSV Export:

    For CSV export of a single table/collection module fast-csv can be used:

    // response headers as usual
    res.status(200);
    res.setHeader('Content-disposition', 'attachment; filename=mytable_dump.csv');
    res.setHeader('Content-type', 'text/csv');
    
    // create csv stream
    var csv = require('fast-csv');
    var csvStr = csv.createWriteStream({headers: true});
    
    // open database stream
    var dbStr = dbDriver.query('SELECT * from mytable').stream();
    
    // connect the streams
    dbStr.pipe(csvStr).pipe(res);
    

    You are now streaming data from DB to HTTP response, converting it into xls/csv format on the fly. No need to buffer or store the entire data in memory or in a file.