I am trying to download images, but they end up corrupted for some reason. For example, this is an image I want to get.
And this is the result.
My test code is:
import urllib2

def download_web_image(url):
    request = urllib2.Request(url)
    img = urllib2.urlopen(request).read()
    with open('test.jpg', 'w') as f:
        f.write(img)

download_web_image("http://upload.wikimedia.org/wikipedia/commons/8/8c/JPEG_example_JPG_RIP_025.jpg")
Why does this happen, and how do I fix it?
You are opening the 'test.jpg' file in the default (text) mode, which causes Python to translate newline characters into the platform's line endings on Windows:
In text mode, the default when reading is to convert platform-specific line endings (\n on Unix, \r\n on Windows) to just \n. When writing in text mode, the default is to convert occurrences of \n back to platform-specific line endings.
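To see the effect without a Windows machine, here is a minimal sketch. It assumes Python 3 (the question uses Python 2), where passing newline="\r\n" to open() forces the Windows-style translation on any platform. The payload bytes and the helper names are made up for illustration, and latin-1 is used only because it maps every byte to exactly one character.

```python
import os
import tempfile

# Hypothetical binary payload that happens to contain the byte 0x0A ("\n"),
# as real image data usually does somewhere.
payload = b"\xff\xd8\xff\xe0\x0a\x00\x10JFIF\x0a\xff\xd9"


def write_text_mode_windows_style(path, data):
    # newline="\r\n" makes text mode rewrite every "\n" as "\r\n" on write,
    # which is what the default text mode does on Windows.
    with open(path, "w", newline="\r\n", encoding="latin-1") as f:
        f.write(data.decode("latin-1"))


def write_binary_mode(path, data):
    # Binary mode writes the bytes exactly as given.
    with open(path, "wb") as f:
        f.write(data)


def read_back(path):
    with open(path, "rb") as f:
        return f.read()


tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, "text_mode.jpg")
binary_path = os.path.join(tmpdir, "binary_mode.jpg")

write_text_mode_windows_style(text_path, payload)
write_binary_mode(binary_path, payload)

text_result = read_back(text_path)      # every 0x0A has become 0x0D 0x0A
binary_result = read_back(binary_path)  # identical to payload

print(len(payload), len(text_result), len(binary_result))
```

The text-mode copy grows by one byte for every 0x0A in the data, which is enough to break a JPEG decoder.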
Of course, JPEG files are not text files, and 'fixing' the newlines will only corrupt the image. Instead, open the file in binary mode:
with open('test.jpg', 'wb') as f:
    f.write(img)
For more details, see the documentation.
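For readers on Python 3, where urllib2 no longer exists, the same fix might look like the sketch below. It assumes urllib.request as the replacement for urllib2; the save_bytes helper and the path parameter are additions for illustration and are not part of the original code.

```python
import urllib.request


def save_bytes(data, path):
    # 'wb' writes the bytes untouched, with no newline translation.
    with open(path, "wb") as f:
        f.write(data)


def download_web_image(url, path="test.jpg"):
    # urlopen() returns a response object whose read() yields bytes.
    with urllib.request.urlopen(url) as response:
        save_bytes(response.read(), path)
```

Calling download_web_image with the URL from the question should then write the image bytes to test.jpg unchanged. Note that in Python 3, writing bytes to a file opened in text mode raises a TypeError instead of silently corrupting the data.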