Tags: javascript, jquery, ajax, web-crawler, apify

How to properly crawl a webpage with infinite scroll?


How would I go about scraping data from a site with infinite scrolling?

What I'm trying to do is get all the data from the Google Play Store (https://play.google.com/store/apps/category/GAME/collection/topselling_free?hl=en).

I'm using Apify (https://www.apify.com/) to crawl the Google Play Store. I want to get all the links in 'Top Free in Games', then get the title and details of each of the top games.

Unfortunately, the page loads new data only when the user scrolls to the bottom, and I can't figure out how to get that new data.

This is my page function:

function pageFunction(context) {
    var $ = context.jQuery;

    if (context.request.label === "DETAIL") {
        // Detail page of a single game: don't follow any further links.
        context.skipLinks();

        // Look up the title element once and reuse it.
        var $title = $('.details-info .info-container .info-box-top .document-title .id-app-title');
        if ($title.length >= 1) {
            return {
                title: $title.text(),
                publisher: $('.details-info .info-container .info-box-top .document-subtitles .primary').text(),
                genre: $('.details-info .info-container .info-box-top .document-subtitles .category').text(),
                rating: $('.details-wrapper .details-section .rating-box .score').text()
            };
        }
    } else {
        // List page: produce no output of its own.
        context.skipOutput();
        // Attempt to request more games. The response is never read,
        // so this call does not add anything to the page.
        $.post("https://play.google.com/store/apps/category/GAME/collection/topselling_free?hl=en&authuser=0");
    }
}

How can I load the additional games and get their links, so that I can then scrape the details from each game's page?

An example or sample code would be greatly appreciated.


Solution

  • Apify has an option called Infinite scroll height under the crawler's advanced settings. It is meant for crawling content that is loaded by infinite scroll. Check the Apify documentation for details.
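
The setting itself is configured in Apify rather than in code, but to make the idea concrete, here is a minimal sketch of what "scroll until a maximum height" could look like. This is not Apify's implementation. The names `scrollToEnd`, `browserPage`, `maxHeight`, `waitMs` and `maxIdleRounds` are all made up for illustration, and the `page` object is a hypothetical adapter so the loop can be tried outside a browser.

```javascript
// Sketch only: keep scrolling until the page stops growing or a height cap is hit.
// `page` is a hypothetical adapter (not an Apify API) with three methods:
//   getHeight() -> current scrollable height in pixels
//   scrollTo(y) -> scroll down to vertical offset y
//   wait(ms)    -> promise that resolves after ms milliseconds
async function scrollToEnd(page, { maxHeight = 50000, waitMs = 1000, maxIdleRounds = 3 } = {}) {
    let lastHeight = page.getHeight();
    let idleRounds = 0; // consecutive rounds in which the page did not grow

    while (lastHeight < maxHeight && idleRounds < maxIdleRounds) {
        page.scrollTo(lastHeight);  // jump to the current bottom of the page
        await page.wait(waitMs);    // give the site time to fetch and render more items

        const newHeight = page.getHeight();
        if (newHeight > lastHeight) {
            lastHeight = newHeight; // new content arrived, keep going
            idleRounds = 0;
        } else {
            idleRounds += 1;        // nothing new this round
        }
    }
    return lastHeight;
}

// Adapter for a real browser page, built only on standard DOM calls.
function browserPage() {
    return {
        getHeight: () => document.documentElement.scrollHeight,
        scrollTo: (y) => window.scrollTo(0, y),
        wait: (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    };
}
```

In a browser console you could run `scrollToEnd(browserPage(), { maxHeight: 20000 })` and read the links from the DOM once the promise resolves. The idle-round counter is there because a slow response can look like the end of the list, and the height cap keeps the loop from running forever on a list that never ends.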