jquery, node.js, web-scraping, cheerio

Reducing Redundancy with NodeJS and Cheerio


How can I remove the redundancy in these two web scrapers? Both request the same website, and I don't want to fetch it twice. I'm new to this and not very familiar with the syntax. Here's the relevant snippet:

    request(website_url, function(err, resp, body) {
        var $ = cheerio.load(body);
        $('.title').each(function(){
            var title = $(this).children('h2').children('span').text();
            titles.push(title);
        });
    });

    request(website_url, function(err, resp, body) {
        var $ = cheerio.load(body);
        $('.post-box-excerpt').each(function(){
            var caption = $(this).children('p').text();
            captions.push(caption);
        });
    });

Solution

  • The easiest way is to make a single request to the website and run both selectors against the same parsed document:

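    request(website_url, function (err, resp, body) {
      // Bail out if the request failed; body is not usable in that case.
      if (err) {
        return console.error(err);
      }

      // Parse the HTML once and reuse $ for both selectors.
      var $ = cheerio.load(body);

      $('.title').each(function () {
        var title = $(this).children('h2').children('span').text();
        titles.push(title);
      });

      $('.post-box-excerpt').each(function () {
        var caption = $(this).children('p').text();
        captions.push(caption);
      });
    });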
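
If more selectors get added later, the same "fetch once, extract many" idea can be pulled into a small helper. The sketch below is only an illustration: `scrapeOnce`, `fakeFetch`, and the regex-based extractors are hypothetical names made up for this example, not part of `request` or cheerio. The fetch and load steps are passed in as functions so the snippet runs without any network access or installed packages.

```javascript
'use strict';

// Hypothetical helper: calls the injected fetchPage once, loads the body once
// via the injected load function, then runs every extractor on that result.
function scrapeOnce(fetchPage, load, extractors, done) {
  fetchPage(function (err, body) {
    if (err) {
      return done(err);
    }
    var doc = load(body); // single parse, shared by all extractors
    var results = {};
    Object.keys(extractors).forEach(function (name) {
      results[name] = extractors[name](doc);
    });
    done(null, results);
  });
}

// Stand-in for request(website_url, ...): counts calls and returns canned HTML.
var fetchCount = 0;
function fakeFetch(callback) {
  fetchCount += 1;
  callback(
    null,
    '<div class="title"><h2><span>Hello</span></h2></div>' +
      '<div class="post-box-excerpt"><p>World</p></div>'
  );
}

// Stand-in extractors working on the raw HTML string (assumption: a real
// scraper would receive the cheerio $ here and use selectors instead).
var extractors = {
  titles: function (html) {
    return Array.from(html.matchAll(/<span>(.*?)<\/span>/g), function (m) {
      return m[1];
    });
  },
  captions: function (html) {
    return Array.from(html.matchAll(/<p>(.*?)<\/p>/g), function (m) {
      return m[1];
    });
  }
};

var output;
scrapeOnce(
  fakeFetch,
  function (body) { return body; }, // stand-in for cheerio.load
  extractors,
  function (err, results) {
    if (err) throw err;
    output = results;
  }
);

console.log(fetchCount, output);
```

With the real libraries, `fetchPage` would wrap the `request(website_url, ...)` call, `load` would be `cheerio.load`, and each extractor would take `$` and contain one of the `.each` loops from the answer above.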