javascript, node.js, download, http-request

How can I access files that automatically download when visiting a page?


In short, I am trying to download a bunch of .png files from Zamzar after converting them. There are over 200 files. To download one, you click a link that takes you to a page, and that page triggers an automatic download. I'm trying to automate this process. Below is the small script I used:

var fs = require('fs'),
    request = require('request'),
    linkPrefix = "http://www.zamzar.com/downloadFile.php?uid-XXX";

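// Log the content type reported for the URI, then stream the body to disk.
var download = function(uri, filename, callback){
  request.head(uri, function(err, res, body){
    if (err) {
      // res is undefined when the request itself fails
      return console.error('HEAD request failed for ' + uri + ':', err.message);
    }
    console.log('content-type:', res.headers['content-type']);

    request(uri).pipe(fs.createWriteStream(filename)).on('close', callback);
  });
};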

var links = ["1.png", "2.png", /* ... */ "200.png"]; // bunch of images

links.forEach(function(e){
  download(linkPrefix + e, e, function(){
    console.log('Done downloading image: ' +e);
  });
});

My question is: how on earth do I discard the HTML responses and capture only the image ones? I tried using the Chrome Dev Tools to analyze the responses, but I'm failing miserably.


Solution

  • I figured out a fix. Usually, when an automatic file download doesn't start, the page offers a direct link to click instead. I used that link's URL as the value of the "linkPrefix" variable in the code above.

    linkPrefix = "http://www.zamzar.com/download.php?uid-XXX";  //download instead of downloadFile
    

    Works like a charm!
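
    If no direct link were available, another option might be to look at the Content-Type header of each response and save only the ones that report an image. The following is a minimal sketch of that idea, not something tested against Zamzar. The helper names isImageContentType and downloadIfImage are made up for illustration, and it assumes plain http:// URLs and a server that labels its images with an image/* content type. It uses Node's built-in http module rather than request, so redirects are not followed.

```javascript
var fs = require('fs'),
    http = require('http');

// Hypothetical helper: true when a Content-Type header value names an image type.
function isImageContentType(contentType) {
  return typeof contentType === 'string' &&
    contentType.trim().toLowerCase().indexOf('image/') === 0;
}

// Hypothetical helper: save the response body only when it is an image.
// Calls callback(err, saved); saved is false when the response was skipped.
function downloadIfImage(uri, filename, callback) {
  http.get(uri, function (res) {
    var contentType = res.headers['content-type'];

    if (res.statusCode !== 200 || !isImageContentType(contentType)) {
      res.resume(); // drain and discard the HTML (or error) body
      return callback(null, false);
    }

    res.pipe(fs.createWriteStream(filename))
      .on('error', callback)
      .on('finish', function () { callback(null, true); });
  }).on('error', callback);
}
```

    It could be called in place of download in the forEach loop above, for example downloadIfImage(linkPrefix + e, e, function (err, saved) { ... }), logging the file name only when saved is true.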