Tags: python, image, web-scraping, beautifulsoup, directory

How to save images to a folder from web scraping? (Python)


How do I save each image gathered from web scraping to a folder? I'm currently using Google Colab since I am just practicing, and I want to store the images in my Google Drive folder.

This is my code for web scraping:

import requests 
from bs4 import BeautifulSoup 

def getdata(url):
  r = requests.get(url)
  return r.text

htmldata = getdata('https://www.yahoo.com/')
soup = BeautifulSoup(htmldata, 'html.parser')

imgdata = []
for i in soup.find_all('img'):
  imgdata = i['src']
  print(imgdata)

Solution

  • I created a pics folder manually in the folder where the script is running to store the pictures. Then I changed your code in the for loop so it appends each URL to the imgdata list instead of overwriting it. The try/except block is there because not every entry in the list is a valid URL.

    import requests 
    from bs4 import BeautifulSoup 
    
    def getdata(url):
        r = requests.get(url)
        return r.text
    
    htmldata = getdata('https://www.yahoo.com/')
    soup = BeautifulSoup(htmldata, 'html.parser')
    
    imgdata = []
    for i in soup.find_all('img'):
        imgdata.append(i['src'])  # changed: append each URL to the list instead of overwriting imgdata
        
    
    
    filename = "pics/picture{}.jpg"
    for i in range(len(imgdata)):
        print(f"img {i+1} / {len(imgdata)}")
        # try block because not everything in the imgdata list is a valid url
        try:
            r = requests.get(imgdata[i], stream=True)
            with open(filename.format(i), "wb") as f:
                f.write(r.content)
        except requests.exceptions.RequestException:
            print(f"{imgdata[i]} is not a valid url")
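Since many `src` attributes are relative paths (`/img/logo.png`) or protocol-relative URLs (`//cdn.example.com/a.jpg`), a lot of the "invalid URL" failures can be avoided by resolving them against the page URL first. Here is a minimal sketch: `resolve_img_urls` is a hypothetical helper built on the standard library's `urllib.parse.urljoin`, and the folder is created with `os.makedirs` instead of by hand. In Colab, the `outdir` path could instead point into a mounted Drive (e.g. `/content/drive/MyDrive/pics` after `drive.mount("/content/drive")`), which is an assumption about the asker's setup, not something from the original answer.

```python
import os
from urllib.parse import urljoin

def resolve_img_urls(base_url, srcs):
    """Turn src attributes (relative, protocol-relative, or absolute)
    into absolute http(s) URLs, dropping unusable ones like data: URIs."""
    urls = []
    for src in srcs:
        absolute = urljoin(base_url, src)  # resolves "/x.png" and "//host/x.png" against base_url
        if absolute.startswith(("http://", "https://")):
            urls.append(absolute)
    return urls

# Hypothetical output folder; in Colab this could be a mounted Drive path.
outdir = "pics"
os.makedirs(outdir, exist_ok=True)  # create the folder programmatically

print(resolve_img_urls(
    "https://www.yahoo.com/",
    ["/img/logo.png", "//s.yimg.com/a.jpg", "data:image/png;base64,xyz"],
))
```

The `data:` URI is dropped because `urljoin` leaves URLs with an unrelated scheme untouched, so they fail the `http(s)` prefix check rather than reaching `requests.get`.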