I'm trying to download a large number of files that all share a common string (DEM) from an FTP server. These files are nested inside multiple directories, for example Adair/DEM* and Adams/DEM*. The FTP server is located at ftp://ftp.igsb.uiowa.edu/gis_library/counties/ and requires no username or password.

So, I'd like to go through each county directory and download the files containing the string DEM.
I've read many questions here on Stack Overflow and the Python documentation, but I cannot figure out how to use ftplib.FTP() to get into the site without a username and password (which are not required), and I can't figure out how to grep or use glob.glob inside ftplib or urllib.
Thanks in advance for your help.
OK, this seems to work. There may be issues if you try to download a directory or scan a file; exception handling may come in handy to trap wrong file types and skip them.
glob.glob cannot work since you're on a remote filesystem, but you can use fnmatch to match the names.
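A quick demonstration of the idea, using a made-up list of filenames standing in for what the server's NLST listing returns:

```python
import fnmatch

# hypothetical filenames, standing in for one county's NLST listing
names = ["DEM10m.zip", "roads.shp", "AdairDEM.tif", "notes.txt"]

# keep only the entries whose name contains "DEM", shell-wildcard style
matches = [n for n in names if fnmatch.fnmatch(n, "*DEM*")]
print(matches)   # ['DEM10m.zip', 'AdairDEM.tif']
```

Note that fnmatch.fnmatch normalizes case the way the local OS does (case-insensitive on Windows); use fnmatch.fnmatchcase if you need strict matching.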
Here's the code: it downloads all files matching *DEM* into the TEMP directory, sorted by county directory.
import ftplib, sys, fnmatch, os

output_root = os.getenv("TEMP")

fc = ftplib.FTP("ftp.igsb.uiowa.edu")
fc.login()                       # no arguments: anonymous login, which is all this server needs
fc.cwd("/gis_library/counties")

root_dirs = fc.nlst()            # one directory per county
for l in root_dirs:
    sys.stderr.write(l + " ...\n")
    #print(fc.size(l))
    dir_files = fc.nlst(l)       # files inside the county directory
    local_dir = os.path.join(output_root, l)
    if not os.path.exists(local_dir):
        os.mkdir(local_dir)
    for f in dir_files:
        if fnmatch.fnmatch(f, "*DEM*"):   # cannot use glob.glob on a remote listing
            sys.stderr.write("downloading " + l + "/" + f + " ...\n")
            local_filename = os.path.join(local_dir, f)
            with open(local_filename, 'wb') as fh:
                fc.retrbinary('RETR ' + l + "/" + f, fh.write)
fc.close()
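As for the exception handling mentioned above, one way to trap wrong file types is to wrap the retrieval in a small helper: ftplib raises ftplib.error_perm when the server refuses a RETR (for instance, when the listed name is actually a directory). safe_retr is a hypothetical name, not part of ftplib; this is just a sketch of the idea:

```python
import ftplib
import os
import sys

def safe_retr(fc, remote_path, local_path):
    """Try to RETR remote_path into local_path; on a permanent FTP
    error (e.g. remote_path is a directory), report it, clean up the
    partial local file, and return False instead of crashing."""
    try:
        with open(local_path, 'wb') as fh:
            fc.retrbinary('RETR ' + remote_path, fh.write)
        return True
    except ftplib.error_perm as e:
        sys.stderr.write("skipping " + remote_path + ": " + str(e) + "\n")
        if os.path.exists(local_path):
            os.remove(local_path)   # drop the empty/partial file
        return False
```

With that in place, the with open(...) block in the inner loop above becomes a single safe_retr(fc, l + "/" + f, local_filename) call, and a bad entry is skipped instead of aborting the whole download.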