
CasperJS click() doesn't trigger the click event correctly


I'm using CasperJS for web scraping, but I ran into problems scraping the page described below.

The HTML of the page looks like this:

<img id="trigger">
<img id="cur_img_xxx" class="show">
<img id="cur_img_yyy" class="cache">

All <img> elements share the same dimensions, and "#trigger" is on the topmost layer. An image with the .show class is displayed on the page; an image with the .cache class is downloaded but hidden. When the user clicks on the image (actually the trigger), the next image is shown and a new image is downloaded via AJAX. The resulting HTML becomes:

<img id="trigger">
<img id="cur_img_xxx" class="cache">
<img id="cur_img_yyy" class="show">
<img id="cur_img_zzz" class="cache">

I guess it's a good strategy for improving the UX and for deterring web scraping, but I still want to scrape :P

I tried $("#trigger").click() in the web console, and the images advanced and downloaded correctly. However, when I tried to simulate this with CasperJS, neither the navigation nor the image download worked. Here is the code:

var casper = require("casper").create({
  clientScripts: [
    'include/jquery.js'
  ],
  pageSettings: {
    // loadImages: false only prevents inline imgs from loading; it has no
    // effect here because all imgs on this page are loaded dynamically
    loadImages: false,
    loadPlugins: false
  },
  verbose: true
});

casper.start("http://www.example.com/1234.html");

casper.then(function () {
  console.log("Connected! Current Url = " + this.getCurrentUrl());
});

casper.then(function () {
  // findInitialImgs will find imgs that have already been loaded 
  imgs = this.evaluate(findInitialImgs);

  this.waitForSelector("#image_trigger").thenClick("#image_trigger");

  var next = this.evaluate(function () {
    return $("img[id^='cur_img_']").last().attr("href");
  });

  console.log(next);
});

casper.run(function () {
  this.echo('End').exit();
});

After "#trigger" is clicked, the last entry should change from <img id="cur_img_yyy"> to <img id="cur_img_zzz">. However, next still refers to <img id="cur_img_yyy">. What am I doing wrong?


Solution

  • It seems to be a jQuery problem. After I removed the jQuery injection and changed $("img[id^='cur_img_']").last().attr("href") to

    var imgs = document.querySelectorAll("img[id^='cur_img_']");
    return imgs[imgs.length - 1].getAttribute("href");
    

    Everything works fine.

    Then I found this answer very helpful: CasperJS click event having AJAX call

    That confirmed it: injecting jQuery breaks the page's own scripts when the page already uses $ as jQuery.
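
For reference, here is a minimal sketch of the jQuery-free lookup as a reusable function. The name lastCurImgAttr and its optional doc parameter are hypothetical and not part of the original script; the doc parameter is only there so the function can be tried outside a browser. The attribute name is passed in because the post reads "href", while an <img> usually carries its URL in "src".

```javascript
// Hypothetical helper (not from the original post): returns the requested
// attribute of the last <img> whose id starts with "cur_img_", or null
// when no such image exists. Uses only the plain DOM API, so nothing
// has to be injected into the page.
function lastCurImgAttr(attrName, doc) {
  // Inside the page, doc is omitted and the global document is used.
  doc = doc || document;
  var imgs = doc.querySelectorAll("img[id^='cur_img_']");
  if (imgs.length === 0) {
    return null;
  }
  return imgs[imgs.length - 1].getAttribute(attrName);
}

// Possible use in a CasperJS script with no clientScripts entry for jQuery
// (assumes the "#image_trigger" selector from the question):
//
//   casper.thenClick("#image_trigger", function () {
//     this.echo(this.evaluate(lastCurImgAttr, "id"));
//   });
```

Passing the function itself to evaluate() keeps it free of closures, which matters because the function runs in the page context rather than in the CasperJS script.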