Tags: node.js, discord.js, puppeteer, reddit

Scraping titles + links from /r/GameDeals that contain the word 'FREE' in discord.js?


I'm super new to JavaScript and programming in general, and I've found an outlet where I can practice it and create features for me and my friends in our Discord channel. I'm trying to set up a scraper that pulls titles with links from the /r/GameDeals subreddit when the title contains the word 'Free'. So far, using resources I've found online, I've been able to get the first 25 links:

const puppeteer = require('puppeteer');

// NOTE: `bot` is assumed to be a discord.js Client that is created and logged in elsewhere.
(async () => {
    const browser = await puppeteer.launch();
    const [page] = await browser.pages();

    await page.goto('https://www.reddit.com/r/GameDeals/', { waitUntil: 'networkidle0' });
    const links = await page.evaluate(async () => {
        // Scroll once and yield briefly before reading the DOM.
        window.scrollBy(0, document.body.clientHeight);
        await new Promise(resolve => setTimeout(resolve, 1));
        return [...document.querySelectorAll('.scrollerItem div:nth-of-type(2) article div div:nth-of-type(3) a')]
            .map((el) => el.href);
    });

    // Reply with the scraped links, one per line, when someone types "gamedeals".
    bot.on('message', msg => {
        if (msg.content === "gamedeals") {
            msg.reply(links.join('\n'));
        }
    });

    await browser.close();
})();

I have a very limited understanding of which HTML classes I need to specify to get what I need, and adding the filter of "contains the word: FREE" is a whole 'nother story.

Any guidance would be greatly appreciated.

I'm using puppeteer, but someone suggested I try Reddit's JSON API via 'reddit.com/r/GameDeals.json' instead. I'm unsure how to even begin with that.


Solution

  • If you want to find only the links that contain the word "free", you need to filter the nodes that you've got within page.evaluate:

    [...document.querySelectorAll('.scrollerItem div:nth-of-type(2) article div div:nth-of-type(3) a')] // <-- we've got all the links
      .filter((el) => el.innerText.toLowerCase().includes('free') ) // <-- only keep those with word "free"
      .map((el) => el.href);
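
  • The question also mentions Reddit's JSON listing at 'reddit.com/r/GameDeals.json'. The same filter can be applied there without a browser. What follows is only a rough sketch, not a tested bot feature: the names filterPostsByWord and getFreeDeals and the User-Agent string are made up for illustration, and it assumes Node 18+ (for the global fetch) and that each post in the listing carries title and permalink fields.

```javascript
// Keep only posts whose title contains `word` (case-insensitive) and
// return title + link pairs. `listing` is the parsed JSON body of the
// subreddit listing (posts live under listing.data.children[i].data).
function filterPostsByWord(listing, word = 'free') {
  const needle = word.toLowerCase();
  return listing.data.children
    .map((child) => child.data)
    .filter((post) => post.title.toLowerCase().includes(needle))
    .map((post) => ({
      title: post.title,
      link: `https://www.reddit.com${post.permalink}`,
    }));
}

// Hypothetical helper: download the listing, then filter it.
// Assumes Node 18+ (global fetch) and that Reddit accepts the request.
async function getFreeDeals() {
  const res = await fetch('https://www.reddit.com/r/GameDeals.json', {
    headers: { 'User-Agent': 'gamedeals-bot-example' }, // placeholder value
  });
  if (!res.ok) throw new Error(`Reddit responded with ${res.status}`);
  return filterPostsByWord(await res.json());
}
```

    In the message handler you could then await getFreeDeals() and reply with something like deals.map((d) => d.title + '\n' + d.link).join('\n'). Note that includes('free') is a plain substring match, so a title containing "freedom" would also pass, and Discord rejects messages longer than 2000 characters, so a long list may need trimming.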