python curl download

Curl only save if not 404


I'm writing a Python program to download some pictures of students at my school.

Here is my code:

    import os
    count = 0
    max_c = 1000000
    while max_c >= count:
        os.system("curl http://www.tjoernegaard.dk/Faelles/ElevFotos/"+str(count)+".jpg > "+str(count)+".jpg")
        count=count+1

The problem is that I only want to save the JPG if the image exists on the server (that is, the request does not return a 404). I don't know which image names exist, so I have to request every number between 0 and 1000000, but not all of those images exist. How do I save an image only when it exists on the server (on Ubuntu)?

Thank you in advance


Solution

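  • import urllib2
    import sys

    for i in range(1000000):
      try:
        # urlopen raises urllib2.HTTPError on a 404, so the file is never created
        pic = urllib2.urlopen("http://www.tjoernegaard.dk/Faelles/ElevFotos/"+str(i)+".jpg").read()
        # "wb": open for binary writing (the default mode is read-only)
        with open(str(i).zfill(7)+".jpg", "wb") as f:
          f.write(pic)
        print "SUCCESS "+str(i)
      except KeyboardInterrupt:
        sys.exit(1)
      except urllib2.HTTPError as e:
        print "ERROR("+str(e.code)+") "+str(i)

    This should work (Python 2): a 404 makes urlopen raise urllib2.HTTPError, so no file is written for a missing image.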
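For anyone on Python 3, where urllib2 no longer exists, the same idea can be sketched with urllib.request. This is only a rough equivalent of the answer above, not a tested replacement: the function names download_if_exists and download_range, the dest_dir argument, and the 10-second timeout are assumptions made for illustration.

```python
import os
import urllib.error
import urllib.request


def download_if_exists(url, dest_path, timeout=10):
    """Save url to dest_path only if the server answers successfully.

    Returns the HTTP status code (for example 200 on success, 404 on a miss).
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            data = response.read()
    except urllib.error.HTTPError as e:
        # 4xx/5xx responses raise HTTPError, so nothing is written to disk.
        return e.code
    # Binary mode: image bytes must be written unchanged.
    with open(dest_path, "wb") as f:
        f.write(data)
    return status


def download_range(base_url, start, stop, dest_dir="."):
    """Try base_url + "<n>.jpg" for n in [start, stop).

    Returns the list of numbers that were actually saved.
    """
    saved = []
    for n in range(start, stop):
        # Zero-padded file names, as in the answer above.
        dest = os.path.join(dest_dir, str(n).zfill(7) + ".jpg")
        if download_if_exists(base_url + str(n) + ".jpg", dest) == 200:
            saved.append(n)
    return saved
```

Calling download_range("http://www.tjoernegaard.dk/Faelles/ElevFotos/", 0, 1000000) would then walk the same range as the loop in the answer. Because the file is opened only after the request has succeeded, a 404 leaves no empty .jpg behind, which is what the shell redirect in the question could not avoid.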