Tags: javascript, download, phantomjs, casperjs

How to download with CasperJS when the URL is being redirected


I've spent several days reading answers and experimenting, trying to figure out how to get CasperJS to download a file when the URL is redirected. I reproduced the problem by trying to download Firefox from https://firefox.com. I get these warnings:

[warning] [phantom] Loading resource failed with status=fail (HTTP 200): https://download.mozilla.org/?product=firefox-48.0.2-SSL&os=linux64&lang=en-US
[warning] [phantom] Loading resource failed with status=fail (HTTP 200): https://download-installer.cdn.mozilla.net/pub/firefox/releases/48.0.2/linux-x86_64/en-US/firefox-48.0.2.tar.bz2

and a 0-byte file named ?product=firefox-48.0.2-SSL&os=linux64&lang=en-US

The second warning tells me that CasperJS does see the redirected URL (both URLs download the same archive if you open them in a browser).

What am I missing to capture the downloaded file?

var casper = require('casper').create({
    pageSettings: {
        userAgent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
    }
});

casper.start().thenOpen("https://firefox.com", function () {
    this.viewport(1200, 800);
});

casper.then(function () {
    this.click('li.os_linux64 a');
    this.wait(3000);
});

casper.on('resource.received', function (resource) {
    if (resource.stage !== "end") {
        return;
    }
    if (resource.url.indexOf('download') > -1) {
        this.download(resource.url, 'out/' + resource.url.substring(resource.url.lastIndexOf('/') + 1));
    }
});

casper.run();
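
A side note on the file name: the script takes everything after the last '/' in the URL, and for the redirecting URL that is only the query string, which is where the ?product=... name comes from. Below is a minimal sketch of one way around that. fileNameFromUrl is a hypothetical helper (not a CasperJS function), and the fallback names are made up for illustration.

```javascript
// The expression used in the script above: everything after the last '/'.
function nameAfterLastSlash(url) {
    return url.substring(url.lastIndexOf('/') + 1);
}

// Hypothetical helper (not part of CasperJS): drop the fragment and query
// string before taking the last path segment, and fall back to a fixed
// name when the path ends in '/'.
function fileNameFromUrl(url, fallback) {
    var path = url.split('#')[0].split('?')[0];
    var name = path.substring(path.lastIndexOf('/') + 1);
    return name || fallback || 'download.bin';
}

var redirecting = 'https://download.mozilla.org/?product=firefox-48.0.2-SSL&os=linux64&lang=en-US';
var direct = 'https://download-installer.cdn.mozilla.net/pub/firefox/releases/48.0.2/linux-x86_64/en-US/firefox-48.0.2.tar.bz2';

console.log(nameAfterLastSlash(redirecting));                  // ?product=firefox-48.0.2-SSL&os=linux64&lang=en-US
console.log(fileNameFromUrl(redirecting, 'firefox-download')); // firefox-download
console.log(fileNameFromUrl(direct));                          // firefox-48.0.2.tar.bz2
```

This only changes what the saved file is called. It does not by itself explain or fix the 0-byte result.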

Versions:

casperjs 1.1.3
phantomjs 2.1.1

Command-line:

casperjs --verbose --log-level=warning --ssl-protocol=any --ignore-ssl-errors=true --web-security=no script.js

Solution

  • I answered my own question. All the examples I saw had

    if (resource.stage !== "end") {
        return;
    }
    

    in the casper.on('resource.received', ...) handler. Removing this check made the download succeed. I'm not sure what the check does (or what no longer happens without it).

    NOTE: I also had to test with a smaller file, because there seems to be a 30-second timeout on resource loading in CasperJS/PhantomJS. See "CasperJS File Download Times Out After 30 Seconds".
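
For reference, here is a sketch of the handler with the stage check removed. As far as the PhantomJS documentation describes it, a received resource is reported with stage "start" and again with stage "end", so without the check the handler runs at both stages. The `seen` map that limits each URL to one download is my own assumption, not part of the original fix, and makeDownloadHandler is a hypothetical name. I have not run this against the real site.

```javascript
// Hypothetical helper: builds a 'resource.received' handler that has no
// stage check. The seen map is an added assumption so that each URL is
// passed to download() at most once, since the event is expected to fire
// for both the "start" and the "end" stage of the same resource.
function makeDownloadHandler(outDir) {
    var seen = {};
    return function (resource) {
        if (resource.url.indexOf('download') === -1) {
            return; // not a download URL
        }
        if (seen[resource.url]) {
            return; // already handled at an earlier stage
        }
        seen[resource.url] = true;
        // Strip the query string so the target is a usable file name.
        var path = resource.url.split('?')[0];
        var name = path.substring(path.lastIndexOf('/') + 1) || 'download.bin';
        // 'this' is the casper instance, as in the original handler.
        this.download(resource.url, outDir + '/' + name);
    };
}

// In the CasperJS script:
// casper.on('resource.received', makeDownloadHandler('out'));
```

If the download only works when it is triggered at both stages, drop the `seen` map and the handler behaves like the fix described above.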