Tags: python, python-3.x, selenium, firefox, geckodriver

How to download a file with Selenium and Firefox in Python?


I am trying to download a file with Selenium, geckodriver, and Firefox, all controlled from Python. The file does get downloaded, but the driver keeps processing something even after the download has finished.

Code I use to download a file:

from selenium import webdriver

fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.dir", downloaddir)
fp.set_preference("browser.download.useDownloadDir", True)
fp.set_preference("browser.download.viewableInternally.enabledTypes", "")
fp.set_preference("browser.download.manager.useWindow", False)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.manager.closeWhenDone", True)
fp.set_preference("browser.helperApps.neverAsk.openFile", "application/zip")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/zip")
fp.set_preference("pdfjs.disabled", True)

driver = webdriver.Firefox(firefox_profile=fp)
driver.get('http://speedtest.tele2.net/10MB.zip')
driver.close()  # this line is never reached

Does anyone know what's going on? I know there is a workaround where you click on an element instead. The problem is that I work with a composed URL, which cannot be clicked and has to be accessed directly.

Versions (Linux):
  Geckodriver 0.29.1
  Firefox 89.0
  Python 3.9.5

Update

A 5-minute timeout is in effect (apparently the default page load timeout), and once it expires the call fails.

So my question is: is there a way to download a file directly with Selenium without raising any kind of error (ideally, of course)?


Solution

  • As suggested by @cards, it is more convenient to use requests or urllib for this kind of work. You can still use Selenium to paginate or click through the site, then read the download URL from the page's HTML and fetch it with requests.

    import requests

    # Retrieve the file contents
    response = requests.get("http://speedtest.tele2.net/10MB.zip")
    response.raise_for_status()  # fail early on an HTTP error status

    # Save them to a local file
    with open("filename.zip", "wb") as file:
        file.write(response.content)

    P.S. The zip file downloaded from the URL you provided is damaged.
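
Since urllib is mentioned above as an alternative, here is a minimal sketch of the same idea using only the standard library. It streams the response to disk in chunks instead of holding the whole file in memory. The function names `download` and `cookie_header`, the `cookies` parameter, and the 30-second timeout are assumptions made for illustration. The cookie handling is only relevant if the site needs the session that Selenium established, and it assumes the list-of-dicts shape returned by `driver.get_cookies()`.

```python
import shutil
import urllib.request
from pathlib import Path


def cookie_header(cookies):
    """Build a Cookie header value from a list of dicts shaped like
    the output of Selenium's driver.get_cookies() (hypothetical helper)."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def download(url, dest, cookies=None, timeout=30.0):
    """Stream `url` into the file `dest` and return the destination path."""
    request = urllib.request.Request(url)
    if cookies:
        # Optional: reuse the browser session by sending its cookies along.
        request.add_header("Cookie", cookie_header(cookies))
    target = Path(dest)
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            target.open("wb") as out:
        shutil.copyfileobj(response, out)  # copies chunk by chunk
    return target
```

A call could look like `download("http://speedtest.tele2.net/10MB.zip", "10MB.zip")`, or, when a Selenium session is already open, `download(url, "file.zip", cookies=driver.get_cookies())`. This keeps Selenium for navigation only, so `driver.get()` is never pointed at the file itself.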