Using selenium, I was able to automate the download of a zip file and save it to a specified directory. When I try to unzip the file, however, I hit a snag where I can't seem to locate the recently downloaded file. If it helps, this is the block of code related to the downloading and unzipping process:
# Click on Map Link
driver.find_element_by_css_selector("input.linksubmit[value=\"▸ Map\"]").click()
# Download Data
driver.find_element_by_xpath('//*[@id="buttons"]/a[4]/img').click()
# Locate recently downloaded file
path = 'C:/.../Download'
list = os.listdir(path)
time_sorted_list = sorted(list, key=os.path.getmtime)
file_name = time_sorted_list[len(time_sorted_list)-1]
Specifically, this is my error:
Traceback (most recent call last):
File "C:\Users\...\AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-89-3f1d00dac284>", line 3, in <module>
time_sorted_list = sorted(list, key=os.path.getmtime)
File "C:\Users\...\AppData\Local\Continuum\Anaconda3\lib\genericpath.py", line 55, in getmtime
return os.stat(filename).st_mtime
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'grid-m1b566d31a87cba1379e113bb93fdb61d5be5b128.zip'
I tried troubleshooting the code by deleting it and placing another file in the directory, and I was able to find the random file, but not the recently downloaded file. Can anyone tell me what's going on here?
First of all, do not use list
for a variable name. That hides the list
constructor from being readily available to use somewhere else in your program. Second, os.listdir
does not return the full path of the files in that directory. If you want the full path, there are two things you can do:
You can use os.path.join
:
import zipfile
path = 'C:/.../Download'
file_list = [os.path.join(path, f) for f in os.listdir(path)]
time_sorted_list = sorted(file_list, key=os.path.getmtime)
file_name = time_sorted_list[-1]
myzip = zipfile.ZipFile(file_name)
for contained_file in myzip.namelist():
if all(n in contained_file.lower() for n in ('corn', 'irrigation', 'high', 'brazil')):
with myzip.open(contained_file) as f:
# save data to a CSV file
You can also use the glob
function from the glob
module:
from glob import glob
import zipfile
path = 'C:/.../Download'
file_list = glob(path+"/*")
time_sorted_list = sorted(file_list, key=os.path.getmtime)
file_name = time_sorted_list[-1]
myzip = zipfile.ZipFile(file_name)
for contained_file in myzip.namelist():
if all(n in contained_file.lower() for n in ('corn', 'irrigation', 'high', 'brazil')):
with myzip.open(contained_file) as f:
# save data in a CSV file
Either should work.