Tags: python, python-3.x, directory, urllib2, urllib

urllib downloading contents of an online directory


I'm trying to write a program that opens an online directory listing, uses a regular expression to extract the names of the PowerPoint files, and then creates local files and copies the remote contents into them. The program appears to run correctly, but when I try to open the downloaded files I keep getting an error saying the version is wrong.

from urllib.request import urlopen
import re

urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
string = urlpath.read().decode('utf-8')

pattern = re.compile('ch[0-9]*.ppt') #the pattern actually creates duplicates in the list

filelist = pattern.findall(string)
print(filelist)

for filename in filelist:
    remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
    localfile = open(filename,'wb')
    localfile.write(remotefile.read())
    localfile.close()
    remotefile.close()

Solution

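  • This code worked for me. I modified it slightly because your pattern was matching each .ppt file name twice. Adding a closing double quote to the pattern makes it match each name only once, and filename[:-1] then strips that quote before the file is requested.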

    from urllib.request import urlopen  # on Python 2: from urllib2 import urlopen
    import re
    
    base_url = 'http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/'
    
    # Fetch the directory listing page.
    urlpath = urlopen(base_url)
    string = urlpath.read().decode('utf-8')
    urlpath.close()
    
    # The trailing quote limits matches to quoted values,
    # so each file name is found only once.
    pattern = re.compile(r'ch[0-9]*\.ppt"')
    
    filelist = pattern.findall(string)
    print(filelist)
    
    for filename in filelist:
        filename = filename[:-1]  # strip the trailing quote
        remotefile = urlopen(base_url + filename)
        localfile = open(filename, 'wb')
        localfile.write(remotefile.read())
        localfile.close()
        remotefile.close()
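
To see why the original pattern returns duplicates, here is a small sketch that runs both patterns against a made-up listing. The sample_listing string is an assumption about what the directory page roughly looks like (each name appearing once in the href and once in the link text), not the actual page. It also shows a possible variation: capturing the name inside href="..." so the trailing quote never ends up in the result.

```python
import re

# Hypothetical excerpt of a directory listing page (an assumption, not the
# real page): each file name appears in the href and again as the link text.
sample_listing = (
    '<a href="ch1.ppt">ch1.ppt</a>\n'
    '<a href="ch2.ppt">ch2.ppt</a>\n'
)

# Pattern from the question (with the dot escaped): matches both occurrences.
loose = re.findall(r'ch[0-9]*\.ppt', sample_listing)

# Anchoring on the href attribute and capturing only the name avoids the
# duplicates and makes the filename[:-1] step unnecessary.
strict = re.findall(r'href="(ch[0-9]*\.ppt)"', sample_listing)

print(loose)   # ['ch1.ppt', 'ch1.ppt', 'ch2.ppt', 'ch2.ppt']
print(strict)  # ['ch1.ppt', 'ch2.ppt']
```

If the real page quotes its attributes differently (single quotes, for example), the strict pattern would need adjusting accordingly.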