python http request urllib2

Trying to download an image from an image URL, but getting HTML instead


This is similar to "Try to scrape image from image url (using python urllib ) but get html instead", but the solution there does not work for me.

from BeautifulSoup import BeautifulSoup
import urllib2
import requests

img_url='http://7-themes.com/data_images/out/79/7041933-beautiful-backgrounds-wallpaper.jpg'

r = requests.get(img_url, allow_redirects=False)

headers = {}
headers['Referer'] = r.headers['location']

r = requests.get(img_url, headers=headers)
with open('7041933-beautiful-backgrounds-wallpaper.jpg', 'wb') as fh:
    fh.write(r.content)

The downloaded file is still an HTML page, not an image.


Solution

  • Your Referer header was not being set correctly. I have hard-coded the referrer and it works fine:

    import requests

    img_url = 'http://7-themes.com/data_images/out/79/7041933-beautiful-backgrounds-wallpaper.jpg'

    # Send the wallpaper's HTML page as the Referer; the value is hard-coded here.
    headers = {'Referer': 'http://7-themes.com/7041933-beautiful-backgrounds-wallpaper.html'}

    # Request the image without following redirects and save the raw bytes.
    r = requests.get(img_url, headers=headers, allow_redirects=False)
    with open('7041933-beautiful-backgrounds-wallpaper.jpg', 'wb') as fh:
        fh.write(r.content)
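
Since the original symptom was an HTML page silently saved under a .jpg name, it may help to check the response before writing it to disk. The following is only a sketch under a few assumptions: it uses the standard library's urllib.request (Python 3) rather than requests so that it has no third-party dependency, the helper names looks_like_image and download_image are hypothetical, and the signature list covers only JPEG, PNG, and GIF.

```python
import urllib.request

# Magic-number prefixes for a few common image formats (not exhaustive).
IMAGE_SIGNATURES = (
    b"\xff\xd8\xff",        # JPEG
    b"\x89PNG\r\n\x1a\n",   # PNG
    b"GIF87a",              # GIF (1987 variant)
    b"GIF89a",              # GIF (1989 variant)
)


def looks_like_image(content_type, body):
    """Return True if the Content-Type and the leading bytes both indicate an image."""
    # Drop any parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return media_type.startswith("image/") and body.startswith(IMAGE_SIGNATURES)


def download_image(img_url, referer, dest_path):
    """Fetch img_url with a Referer header; write the file only if it is an image."""
    request = urllib.request.Request(img_url, headers={"Referer": referer})
    # urlopen follows redirects by default, so a redirect to an HTML page
    # shows up here as a text/html response rather than as a 3xx status.
    with urllib.request.urlopen(request, timeout=10) as response:
        content_type = response.headers.get("Content-Type")
        body = response.read()
    if not looks_like_image(content_type, body):
        raise ValueError("expected an image, got %r" % content_type)
    with open(dest_path, "wb") as fh:
        fh.write(body)
    return dest_path
```

For the URL in the question, this could be called as download_image(img_url, 'http://7-themes.com/7041933-beautiful-backgrounds-wallpaper.html', '7041933-beautiful-backgrounds-wallpaper.jpg'). If the site still answers with an HTML page, the call raises ValueError instead of saving a mislabelled file.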