Tags: python-3.x, playwright, pyppeteer, playwright-python

Downloading PDF files using playwright-python


I'm trying to download PDF files that are rendered in the browser (not opened as a popup or downloaded) using Playwright for Python. No URL is exposed, so I can't simply scrape a link and fetch it with requests.get("file_url").

I've tried:

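import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        # Current playwright-python uses snake_case names
        page = await browser.new_page(accept_downloads=True)

        # The URL needs an explicit scheme
        await page.goto("https://www.some_landing_page.com")

        async with page.expect_download() as download_info:
            await page.click("a")     # selector to a pdf file

        # In the async API both of these must be awaited
        download = await download_info.value
        path = await download.path()

asyncio.run(main())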

I've also tried page.expect_popup(), with no luck. My understanding is that this can't be done with pyppeteer, but I would welcome a solution that way as well, if possible.


Solution

  • For anyone with a similar problem: try launching Firefox or WebKit instead of Chromium as the browser. That worked around the issue for me.
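
A minimal sketch of that work-around, for anyone who wants something to start from. It assumes that clicking the link fires a download event in Firefox or WebKit, as it did for me; I haven't verified that for every site. The names download_pdf, download_pdf_sync, dest_dir and browser_name are made up for this example, and the URL and selector you pass in are your own.

```python
import asyncio
from pathlib import Path


async def download_pdf(url: str, selector: str, dest_dir: str = ".",
                       browser_name: str = "firefox") -> Path:
    """Click `selector` on `url` and save the resulting download in `dest_dir`."""
    if browser_name not in ("firefox", "webkit", "chromium"):
        raise ValueError(f"unknown browser: {browser_name}")

    # Imported here so the module still loads where Playwright is not installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # p.firefox / p.webkit / p.chromium are the three browser types.
        browser = await getattr(p, browser_name).launch()
        page = await browser.new_page(accept_downloads=True)
        await page.goto(url)

        # Start waiting for the download before the click that triggers it.
        async with page.expect_download() as download_info:
            await page.click(selector)
        download = await download_info.value

        # Save under the file name suggested by the browser.
        target = Path(dest_dir) / download.suggested_filename
        await download.save_as(target)

        await browser.close()
        return target


def download_pdf_sync(*args, **kwargs) -> Path:
    """Blocking wrapper for scripts that are not already inside an event loop."""
    return asyncio.run(download_pdf(*args, **kwargs))
```

Something like download_pdf_sync("https://www.some_landing_page.com", "a") should then return the path of the saved file. Passing browser_name="webkit" tries the other engine.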