
Get bytes of downloaded file in playwright (example in python)


I am using playwright in python to automate some tasks. At the end, I want to download a file and save it to my cloud storage.

For this, I need to get the bytes of the downloaded file, so I can POST/PUT these bytes to my cloud API.

I used the very straightforward download method in playwright, as such:

with page.expect_download() as download_info:
    page.get_by_role("button", name="Download PDF").click()

download = download_info.value   

I can easily save the file, but I can't find anything about getting the byte stream. I would expect something like this to be possible:

download_in_bytes = download.tobytes()

But there is no such method available in playwright.

Does somebody know a way? I can probably save the file first, and then open it again to get the bytes, but I'd rather do it without saving the file in between.


Solution

  • I could not find a way to get the bytes directly; the closest I could get is the following:

    import tempfile

    # download file
    with page.expect_download() as download_info:
        page.get_by_role("button", name="Download PDF").click()

    download = download_info.value

    # reserve a temporary file name; closing the handle first lets save_as
    # write to it, and delete=False keeps the file on disk after closing
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        path = tmp.name
    download.save_as(path)

    # load file in bytes (named file_bytes to avoid shadowing the built-in bytes)
    with open(path, "rb") as f:
        file_bytes = f.read()

    # process bytes
    ...
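
    A possible variation on the temporary-file approach, as a sketch rather than a definitive answer: Playwright already writes each download to its own temporary location, and `download.path()` returns that location once the download has finished, so the extra `save_as` copy may not be needed. The helper names `read_download_bytes` and `read_download_bytes_via_copy` below are hypothetical, and the code assumes `download` is the object taken from `download_info.value`. As far as I know, `download.path()` does not work when the browser is connected remotely, which is why a copy-based fallback is included.

```python
import tempfile
from pathlib import Path


def read_download_bytes(download) -> bytes:
    """Return the content of a finished download as bytes.

    Assumes `download` exposes `path()`, as Playwright's Download object
    does. `path()` waits for the download to finish and returns the file
    Playwright has already written, so no second copy is made here.
    """
    return Path(download.path()).read_bytes()


def read_download_bytes_via_copy(download, suffix: str = ".pdf") -> bytes:
    """Fallback: copy the download with save_as(), read it, then clean up.

    The temporary directory and the copied file are removed automatically
    when the `with` block exits.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / f"download{suffix}"
        download.save_as(target)
        return target.read_bytes()
```

    With the snippet from the question, this would be called as `file_bytes = read_download_bytes(download_info.value)`. The file still exists on disk for a moment in both variants; the difference is only that nothing is left behind to delete by hand.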