javascript dom cheerio

Web scraping Walmart's products with cheerio


I am trying to scrape Walmart's product listings. The page I am trying to pull is https://www.walmart.com/search/?query=&cat_id=91083 and I can successfully scrape only about 10 products from it. Here is the code I am using.

const axios = require('axios');
const cheerio = require('cheerio');

axios.get('https://www.walmart.com/search/?query=&cat_id=91083').then(res => {
    const combino1 = [];
    // Parse the HTML string returned by the server
    const $ = cheerio.load(res.data);

    // Collect the text of every product title link
    $('a.product-title-link').each((index, element) => {
        const name = $(element).text();
        combino1[index] = { name };
    });
    console.log(combino1);
});

When I search the DOM in the browser for a.product-title-link, it shows 40 products. Why am I only able to grab 10 and not 40?


Solution

  • The issue is that a request made with axios only returns the HTML that the server sends in its initial response.

    This means that any products the page fetches afterwards through asynchronous calls in the browser will never be in that response, because axios does not execute the page's JavaScript.

    Writing the received data to a new file shows this:

    const fs = require('fs')
    ...
    fs.writeFileSync('./data.html', res.data)
    

    Opening the new data.html file and searching for product-title-link finds only 10 matches.

    For this you cannot use axios alone. You need a browser automation library such as Puppeteer, because with it you can wait for all products to load before traversing the DOM.