Tags: python, python-3.x, web-scraping, pyppeteer

Can't create a loop to fetch all the titles from a webpage


I've written a script in Python, in combination with pyppeteer, to scrape the titles of different posts from a webpage along with their links. The problem is that when I run my script, it only parses the title and link of the first post. My intention is to create a loop to get them all, but as I'm very new to this library, I can't figure out how to do it. Any help will be appreciated.

My script so far:

import asyncio
from pyppeteer import launch

async def get_titles_n_links():
    wb = await launch(headless=True)
    page = await wb.newPage()
    await page.goto('https://stackoverflow.com/questions/tagged/web-scraping')

    element = await page.querySelector('.question-hyperlink')
    title = await page.evaluate('(element) => element.textContent', element)
    link = await page.evaluate('(element) => element.href', element)
    print(f'{title}\n{link}\n')
    await wb.close()

asyncio.get_event_loop().run_until_complete(get_titles_n_links())

Solution

  • Your code will look like this, using querySelectorAll to collect every matching element instead of just the first:

    import asyncio
    from pyppeteer import launch
    
    async def get_titles_n_links():
        wb = await launch(headless=True)
        page = await wb.newPage()
        await page.goto('https://stackoverflow.com/questions/tagged/web-scraping')
    
        # querySelectorAll returns every element matching the selector,
        # not just the first one
        elements = await page.querySelectorAll('.question-hyperlink')
    
        # Evaluate textContent and href for each matched anchor
        for element in elements:
            title = await page.evaluate('(element) => element.textContent', element)
            link = await page.evaluate('(element) => element.href', element)
            print(f'{title}\n{link}\n')
    
        await wb.close()
    
    asyncio.get_event_loop().run_until_complete(get_titles_n_links())
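
    As an aside, looping with two page.evaluate calls per element means one browser round-trip for every field. If that ever feels slow, the same data can usually be pulled in a single in-page evaluation via querySelectorAllEval (pyppeteer's counterpart to puppeteer's $$eval). A minimal sketch, assuming that method is available in your pyppeteer version:

    import asyncio
    from pyppeteer import launch
    
    async def get_titles_n_links():
        wb = await launch(headless=True)
        page = await wb.newPage()
        await page.goto('https://stackoverflow.com/questions/tagged/web-scraping')
    
        # Run one JS function over all matching elements inside the page
        # and return the (title, link) pairs in a single call
        posts = await page.querySelectorAllEval(
            '.question-hyperlink',
            '(elements) => elements.map(e => [e.textContent, e.href])'
        )
    
        for title, link in posts:
            print(f'{title}\n{link}\n')
    
        await wb.close()
    
    asyncio.get_event_loop().run_until_complete(get_titles_n_links())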