javascript phantomjs casperjs noscript

CasperJS scraping foiled by <noscript> tag


I'm using CasperJS to scrape a website. The page source has a <noscript> tag, so instead of the content I need to scrape I get a message claiming that I don't have JavaScript enabled.

javascriptEnabled is true by default in CasperJS, but I added it to my initialization anyway, to no avail.

Are there any workarounds for this? It might also be an issue with PhantomJS.


Solution

  • OK, this issue has been fixed. Here is what I did, in case anyone has questions. The HTML is rendered by JavaScript, which takes a long time to load. Open the page as you normally would in a browser and find an element that only appears once the JavaScript has run. Note that "View Source" doesn't work for this; you have to use "Inspect Element", which shows the current DOM.

    I then did:

    casper.waitForSelector('.SOME_CLASS', function() {
        this.echo(this.getHTML('.SOME_CLASS'));
        this.echo(this.getElementInfo('.SOME_CLASS').text);
    });
    

    This makes CasperJS wait until the JavaScript has rendered that element before the callback runs.
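
A slightly fuller version of the same idea, as a minimal sketch rather than the exact script used above. It wraps the wait in a helper, adds a timeout handler, and passes an explicit timeout, since the default wait timeout in CasperJS (5 seconds, configurable through the waitTimeout option) may be too short for a slow page. The helper name echoWhenRendered, the URL, and the 20-second value are placeholders made up for this example.

```javascript
// Queue a step that waits for `selector` to appear in the live DOM, then
// prints its markup and text. `casper` is a CasperJS instance; the helper
// name and its parameters are hypothetical, chosen for this sketch.
function echoWhenRendered(casper, selector, timeoutMs) {
    casper.waitForSelector(selector, function onFound() {
        this.echo(this.getHTML(selector));
        this.echo(this.getElementInfo(selector).text);
    }, function onTimeout() {
        // Reached only if the element never appears within timeoutMs.
        this.echo('Timed out waiting for ' + selector);
    }, timeoutMs);
}

// `phantom` is a global only inside PhantomJS/CasperJS, so this part is
// skipped if the file is loaded anywhere else.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();
    casper.start('http://example.com/');            // placeholder URL
    echoWhenRendered(casper, '.SOME_CLASS', 20000); // assumed 20 s limit
    casper.run();
}
```

Saved as, say, scrape.js, this would be run with `casperjs scrape.js`. If the timeout message appears, the selector is probably wrong or the page needs longer to render.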