Tags: python, chromium, pyppeteer

Headless puppeteer/pyppeteer doesn't render SSR page


I'm trying to scrape a page with pyppeteer (https://loja.meo.pt/Equipamentos/gaming/Sony/PS5-Digital-Comando-DS-Plus-Card-365-dias?cor=Branco&modo-compra=PromptPayment). The screenshot works and I can see the cookie consent modal, but the background is plain white. I evaluated JavaScript to accept the cookies and took another screenshot: the modal is gone, but the page is still white, even after reloading. I'm not sure why this isn't working. It works with puppeteer on Node.js (using the free, open-source streetmerchant), so it must be something else.

    import asyncio
    import os

    from pyppeteer import launch

    url = "https://loja.meo.pt/Equipamentos/gaming/Sony/PS5-Digital-Comando-DS-Plus-Card-365-dias?cor=Branco&modo-compra=PromptPayment"

    async def main():
        browser = await launch(
            ignoreHTTPSErrors=True,
            headless=True,
            # Path to a custom Chrome/Chromium binary, read from the environment.
            executablePath=os.getenv('CHROME_PATH'),
            args=[
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-dev-shm-usage',
                '--headless',
                '--disable-gpu',
                '--ignore-certificate-errors'
            ]
        )
        page = await browser.newPage()

        await page.setViewport({'width': 1920, 'height': 1280})

        # Wait until the network has gone idle before taking the screenshot.
        await page.goto(url, {'waitUntil': 'networkidle0'})

        await page.screenshot({'path': 'screenshot.png'})

    asyncio.get_event_loop().run_until_complete(main())

Some help would be awesome!

[Image: the first screenshot I take]

It feels like the React app is not starting. Any help would be very welcome!

SOLUTION

The script is fine and everything works correctly. The problem was with Docker and volumes: I had several instances running the same script, and they were using old screenshots and old scripts. I had to remove all of them and restart the script, and it worked on the first attempt.
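
One way to make this kind of stale-output problem easier to spot is to write each screenshot to a unique, timestamped file and to check the file's modification time against the start of the current run. The following is only a sketch using the standard library; the helper names `fresh_screenshot_path` and `is_stale` and the one-second tolerance are assumptions for illustration, not part of pyppeteer or of the original script.

```python
import os
import uuid
from datetime import datetime, timezone


def fresh_screenshot_path(directory: str, prefix: str = "screenshot") -> str:
    """Return a unique, timestamped PNG path inside `directory` (hypothetical helper)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # A random suffix keeps names distinct even when several instances
    # write to the same mounted volume within the same second.
    suffix = uuid.uuid4().hex[:8]
    return os.path.join(directory, f"{prefix}-{stamp}-{suffix}.png")


def is_stale(path: str, started_at: float, tolerance: float = 1.0) -> bool:
    """Return True if `path` is missing or was last modified before this run.

    `started_at` is a Unix timestamp (for example from time.time()) taken
    when the run began. `tolerance` (in seconds, an assumed value) allows
    for coarse filesystem timestamps.
    """
    try:
        return os.path.getmtime(path) < started_at - tolerance
    except FileNotFoundError:
        return True
```

In the script above, the result of `fresh_screenshot_path(...)` could be passed as the `'path'` value to `page.screenshot`, and `is_stale` could be called afterwards to confirm that the file being inspected really came from the current run.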


Solution

  • Each puppeteer version has a list of fully compatible Chromium versions, and a mismatch may be the cause of your issue.

    The same script you shared worked for me. The only change was using the default Chromium that ships with pyppeteer instead of a custom executable path.

    from pyppeteer import launch
    import asyncio

    url = "https://loja.meo.pt/Equipamentos/gaming/Sony/PS5-Digital-Comando-DS-Plus-Card-365-dias?cor=Branco&modo-compra=PromptPayment"

    async def main():
        browser = await launch(
            ignoreHTTPSErrors=True,
            headless=True,
            # executablePath=os.getenv('CHROME_PATH'),
            args=[
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-dev-shm-usage',
                '--headless',
                '--disable-gpu',
                '--ignore-certificate-errors'
            ]
        )
        page = await browser.newPage()

        await page.setViewport({'width': 1920, 'height': 1280})

        await page.goto(url, {'waitUntil': 'networkidle0'})

        await page.screenshot({'path': 'screenshot.png'})

    if __name__ == '__main__':
        asyncio.get_event_loop().run_until_complete(main())
    

    I used Python 3.8.5 with these dependency versions:

    asyncio==3.4.3 
    pyppeteer==0.2.5
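
To switch between a custom browser binary and the default Chromium without editing the script, the launch options could be built from the environment, leaving out `executablePath` when `CHROME_PATH` is unset or empty. This is a minimal sketch, not the answer's exact code: the helper name `build_launch_options` is hypothetical, and it relies on the behaviour shown above, where omitting `executablePath` lets pyppeteer use its own Chromium.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Same flags as the script above, minus the redundant '--headless'
# (headless mode is already requested via headless=True).
DEFAULT_ARGS = [
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--disable-dev-shm-usage",
    "--disable-gpu",
    "--ignore-certificate-errors",
]


def build_launch_options(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for pyppeteer's launch() (hypothetical helper).

    `executablePath` is only included when CHROME_PATH is set to a
    non-empty value; otherwise the key is omitted entirely.
    """
    env = os.environ if env is None else env
    options: Dict[str, Any] = {
        "ignoreHTTPSErrors": True,
        "headless": True,
        "args": list(DEFAULT_ARGS),
    }
    chrome_path = env.get("CHROME_PATH", "").strip()
    if chrome_path:
        options["executablePath"] = chrome_path
    return options
```

Inside `main()`, this would be used as `browser = await launch(**build_launch_options())`. Running once with `CHROME_PATH` set and once without it gives a quick comparison between the custom binary and the default Chromium.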