python selenium unzip

Python - Can't locate downloaded file to unzip


Using selenium, I was able to automate the download of a zip file and save it to a specified directory. When I try to unzip the file, however, I hit a snag where I can't seem to locate the recently downloaded file. If it helps, this is the block of code related to the downloading and unzipping process:

# Click on Map Link
driver.find_element_by_css_selector("input.linksubmit[value=\"▸ Map\"]").click()
# Download Data
driver.find_element_by_xpath('//*[@id="buttons"]/a[4]/img').click()

# Locate recently downloaded file
path = 'C:/.../Download'
list = os.listdir(path)
time_sorted_list = sorted(list, key=os.path.getmtime)
file_name = time_sorted_list[len(time_sorted_list)-1]

Specifically, this is my error:

Traceback (most recent call last):
  File "C:\Users\...\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-89-3f1d00dac284>", line 3, in <module>
    time_sorted_list = sorted(list, key=os.path.getmtime)
  File "C:\Users\...\AppData\Local\Continuum\Anaconda3\lib\genericpath.py", line 55, in getmtime
    return os.stat(filename).st_mtime
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'grid-m1b566d31a87cba1379e113bb93fdb61d5be5b128.zip'

I tried troubleshooting by deleting the downloaded file and placing a different file in the directory. The code found that other file, but it still cannot find the recently downloaded one. Can anyone tell me what's going on here?


Solution

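  • First of all, do not use list as a variable name. Doing so shadows the built-in list type, so it can no longer be called elsewhere in that scope. Second, os.listdir returns only the names of the entries in the directory, not their full paths. os.path.getmtime then resolves each bare name against the current working directory, where the downloaded file does not exist, and that is what raises the FileNotFoundError. If you want the full path, there are two things you can do: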

    You can use os.path.join:

    import os
    import zipfile


    path = 'C:/.../Download'
    # os.listdir returns bare names, so join each one with the directory
    file_list = [os.path.join(path, f) for f in os.listdir(path)]
    time_sorted_list = sorted(file_list, key=os.path.getmtime)
    file_name = time_sorted_list[-1]  # most recently modified entry
    with zipfile.ZipFile(file_name) as myzip:
        for contained_file in myzip.namelist():
            if all(n in contained_file.lower() for n in ('corn', 'irrigation', 'high', 'brazil')):
                with myzip.open(contained_file) as f:
                    data = f.read()  # raw bytes of the member; save them to a CSV file
    

    You can also use the glob function from the glob module:

    import os
    import zipfile
    from glob import glob


    path = 'C:/.../Download'
    # glob returns each match with the directory prefix already attached
    file_list = glob(os.path.join(path, '*'))
    time_sorted_list = sorted(file_list, key=os.path.getmtime)
    file_name = time_sorted_list[-1]  # most recently modified entry

    with zipfile.ZipFile(file_name) as myzip:
        for contained_file in myzip.namelist():
            if all(n in contained_file.lower() for n in ('corn', 'irrigation', 'high', 'brazil')):
                with myzip.open(contained_file) as f:
                    data = f.read()  # raw bytes of the member; save them to a CSV file
    

    Either should work.
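
    As a rough, self-contained illustration of the point about os.listdir, the sketch below builds two small zip files in a temporary directory, shows that os.listdir yields bare names only, and then picks the most recently modified one by full path. The names newest_file, suffix and demo are hypothetical helpers made up for this example, and filtering on a ".zip" suffix is an assumption that is not part of the original code:

```python
import os
import tempfile
import time
import zipfile


def newest_file(directory, suffix=".zip"):
    """Return the full path of the most recently modified matching file.

    Hypothetical helper for illustration; `suffix` is an assumed filter.
    """
    candidates = [
        os.path.join(directory, name)  # os.listdir yields bare names, so join them
        for name in os.listdir(directory)
        if name.lower().endswith(suffix)
    ]
    if not candidates:
        raise FileNotFoundError(f"no {suffix} files in {directory!r}")
    # max() with a key avoids sorting the whole list just to take the last item
    return max(candidates, key=os.path.getmtime)


def demo():
    """Create two zip files with different mtimes and locate the newer one."""
    with tempfile.TemporaryDirectory() as tmp:
        old_zip = os.path.join(tmp, "old.zip")
        new_zip = os.path.join(tmp, "new.zip")
        for zip_path in (old_zip, new_zip):
            with zipfile.ZipFile(zip_path, "w") as zf:
                zf.writestr("data.csv", "a,b\n1,2\n")

        # Set the modification times explicitly so the ordering is unambiguous
        now = time.time()
        os.utime(old_zip, (now - 100, now - 100))
        os.utime(new_zip, (now, now))

        bare_names = sorted(os.listdir(tmp))  # names only, no directory prefix
        found = newest_file(tmp)
        with zipfile.ZipFile(found) as zf:
            members = zf.namelist()
        return bare_names, os.path.basename(found), members


if __name__ == "__main__":
    print(demo())
```

    Calling newest_file on the download directory would stand in for the sorted(...)[-1] lines in either version above.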