Tags: python, html, image, web-scraping, urllib

Python WebScraper - object has no attribute 'urlretrieve'


I am trying to create a Python web scraper that downloads a certain number of images from a URL to my current directory. However, for the following line of code:

urllib.request.urlretrieve(each, filename)

When I run the program, it fails with: AttributeError: 'function' object has no attribute 'urlretrieve'

Here is the full code:

from urllib.request import urlopen
from bs4 import BeautifulSoup as soup


url = 'https://unsplash.com/s/photos/download'

 

def download_imgs(url, amountOfImgs):
    html = urlopen(url).read()
    #parsing the html from the url
    page_soup = soup(html, "html.parser")
    images = [img for img in page_soup.findAll('img')]
    counter = 0
    #compiling the unicode list of image links
    image_links = [each.get('src') for each in images]

    for each in image_links:
        if(counter <= amountOfImgs):
            filename = each.split('/')[-1]
            urllib.request.urlretrieve(each, filename)
            counter += 1
        else:
            return image_links
print(download_imgs(url, 5))

Solution

  • It looks like when you imported just urlopen, you missed everything else.

    I did it a bit differently: I got the HTML with requests.get, which removes the need for urlopen. Alternatively, you could just import urlopen and urlretrieve directly from urllib.request (see the sketch after the code below).

    If you want to use mine, I know it worked:

    import urllib.request
    from bs4 import BeautifulSoup as soup
    import requests

    url = 'https://unsplash.com/s/photos/download'


    def download_imgs(url, amountOfImgs):
        # fetch the page with requests instead of urlopen
        req = requests.get(url)
        html = req.text
        # parse the html from the url
        page_soup = soup(html, "html.parser")
        images = page_soup.findAll('img')
        counter = 0
        # compile the list of image links, skipping <img> tags with no src
        image_links = [each.get('src') for each in images if each.get('src')]

        for each in image_links:
            if counter < amountOfImgs:
                # use the last part of the URL as the local filename
                filename = each.split('/')[-1]
                urllib.request.urlretrieve(each, filename)
                counter += 1
            else:
                return image_links
        # also return the links if there were fewer than amountOfImgs of them
        return image_links


    print(download_imgs(url, 5))
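
    If you would rather stick with urllib for everything, which is the other option mentioned above, a minimal sketch of that approach could look like this. It keeps the same Unsplash URL and overall logic; only the imports and the download call change, so urlretrieve is called directly without the urllib.request prefix.

    from urllib.request import urlopen, urlretrieve
    from bs4 import BeautifulSoup as soup

    url = 'https://unsplash.com/s/photos/download'

    def download_imgs(url, amountOfImgs):
        # fetch and parse the page with urlopen instead of requests
        html = urlopen(url).read()
        page_soup = soup(html, "html.parser")
        # collect the src of every <img> tag that actually has one
        image_links = [img.get('src') for img in page_soup.findAll('img') if img.get('src')]

        for counter, each in enumerate(image_links):
            if counter >= amountOfImgs:
                break
            # urlretrieve is imported directly, so no urllib.request prefix is needed
            urlretrieve(each, each.split('/')[-1])
        return image_links

    print(download_imgs(url, 5))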