Tags: javascript, selenium, web-scraping, phantomjs, casperjs

Get Yahoo comments with web scraping


I'm trying to scrape the comments on Yahoo news articles. The page has a "See reactions" link with the id "caascommtbar-wide". I tried to locate that element with CasperJS, Selenium, and ScrapySharp so that I could click the link and display the comments, but none of these tools ever finds the element, even when I use XPath.

CasperJS:

    casper.then(function () {
        if (this.exists('a.caascommtbar-anchor')) {
            this.echo("It exists");
        } else {
            this.echo("It Does not Exist");
        }
    });

    casper.then(function () {
        // Click on 1st result link
        this.click('a.caascommtbar-anchor');
    });

Selenium:

    driver.FindElement(By.Id("caascommtbar-anchor")).Click();

Does anyone know why I can't access the part of the HTML where the comments are located?

The same thing happens when I try to access the Facebook comments embedded in news forums.


Solution

  • As Isaac said, parts of the page are loaded asynchronously, so the element does not exist yet when your script first looks for it. You should add waitFor steps to your code. Here is code that does just that.

    // Article whose comments we want to read
    var url = "https://es-us.vida-estilo.yahoo.com/instagram-cierra-la-cuenta-de-una-modelo-por-ser-gorda-103756072.html";
    var casper = require('casper').create({
      viewportSize: {width: 1280, height: 800}
    });

    casper.start(url, function() {
      this.echo('Opened page');
    });

    // Wait until the asynchronously loaded comments link is in the DOM, then click it
    casper.waitForSelector('a.comments-title', function() {
      this.click('.comments-title');
    });

    // Wait until at least one comment has been rendered, then print the list's HTML
    casper.waitForSelector('ul.comments-list > li', function() {
      this.echo(this.getHTML('ul.comments-list'));
    });

    casper.run();
    

    Hope that helps
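
To make the idea behind the waitFor steps more concrete, here is a minimal sketch of polling for a condition until it becomes true or a timeout expires. It is plain JavaScript meant for Node.js, not something to run inside CasperJS, and it is not how CasperJS implements waitForSelector internally. The `waitFor` helper, its parameters, and the simulated `comments` variable are all hypothetical names made up for this illustration.

```javascript
// Hypothetical helper (not part of CasperJS): calls `check` every
// `intervalMs` milliseconds until it returns a truthy value, or rejects
// once `timeoutMs` milliseconds have passed.
function waitFor(check, timeoutMs, intervalMs) {
  return new Promise(function (resolve, reject) {
    var deadline = Date.now() + timeoutMs;

    function poll() {
      var result;
      try {
        result = check();
      } catch (err) {
        reject(err);
        return;
      }
      if (result) {
        resolve(result);            // condition met: hand back the value
      } else if (Date.now() >= deadline) {
        reject(new Error('Timed out after ' + timeoutMs + ' ms'));
      } else {
        setTimeout(poll, intervalMs); // not ready yet: try again later
      }
    }

    poll();
  });
}

// Usage: simulate content that only appears some time after "page load".
var comments = null;
setTimeout(function () { comments = ['first comment']; }, 50);

waitFor(function () { return comments; }, 1000, 10)
  .then(function (found) { console.log('Found', found.length, 'comment(s)'); })
  .catch(function (err) { console.error(err.message); });
```

A single immediate lookup, like the exists() and click() calls in the question, corresponds to calling `check` only once, which fails whenever the content arrives later than the first look.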