
How to download images using the Google Custom Search API?


I am using the Google image search API in Python to download 500 images. After downloading a few images, it returns a bad request error. Below is the code:

from google_images_search import GoogleImagesSearch
import os
from tqdm import tqdm
import time


# Set up Google Images Search
gis = GoogleImagesSearch(API_KEY, API_SECRET, validate_images=False)

def download_images_in_batches(location, output_folder):
    search_params = {
        'q': location,
        'num': 50,  # Number of images to download per batch
        'fileType': 'jpg|png' # File types to include in the search
    }

    # Create the folder for the location if it doesn't exist
    location_folder = os.path.join(output_folder, location)
    os.makedirs(location_folder, exist_ok=True)

    # Download images in batches
    total_images = 500  # Total number of images to download
    images_per_batch = 50  # Number of images to download per batch
    batches = total_images // images_per_batch

    for batch in tqdm(range(batches)):
        start_index = batch * images_per_batch + 1
        search_params['start'] = start_index
        search_params['num'] = 50
        time.sleep(1)
        # Perform the search and download the images
        gis.search(search_params=search_params)

        for index, image in enumerate(gis.results()):
            if index >= images_per_batch:
                break
            image.download(location_folder)

        gis.next_page()

    print(f"Downloaded {total_images} images for {location} in folder '{location_folder}'")

download_images_in_batches('query_to_search', 'destination_path')

I would like to download 500 images for a given location, and I am doing that in batches. After downloading 200 images, I get the error below:

googleapiclient.errors.HttpError: <HttpError 400 when requesting https://customsearch.googleapis.com/customsearch/v1?cx=822fca83c44f645bb&q=US+Embassy+Baghdad&searchType=image&num=10&start=201&fileType=jpg%7Cpng&safe=off&key=AIzaSyD0pMnbiJmUnFactRxZvChEqY0i2G7gkFs&alt=json returned "Request contains an invalid argument.". Details: "[{'message': 'Request contains an invalid argument.', 'domain': 'global', 'reason': 'badRequest'}]">

My API quota is greater than 500 requests per day. Can anyone tell me what I am doing wrong?


Solution

  • Check how many images the programmable search engine you are using (https://programmablesearchengine.google.com) actually returns for this query. It probably returns fewer than 500.

    The number of results differs between a Google search in the browser and a programmable search engine.

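    I would encourage you to query the Google CSE API directly rather than going through a wrapper library like google_images_search. The Google CSE documentation caps num at 10 results per request, and its description of the start parameter states that the API will not return more than 100 results for a single query, so the way to collect more images is to issue different search strings and download 10 images per query.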

    In your case, I would run 50 different queries, each returning 10 images, for a total of 500. You can formulate the queries by following the Google CSE API documentation (https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list).
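
As a rough illustration of the approach above, here is a minimal Python sketch rather than a definitive implementation. It uses only the standard library. The endpoint and the parameters key, cx, q, searchType, num and start are taken from the request URL in the error message; the helper names (build_url, image_links, total_results, collect_image_links) and the choice to drop duplicate links are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as it appears in the error message from the question.
CSE_ENDPOINT = "https://customsearch.googleapis.com/customsearch/v1"
MAX_PER_REQUEST = 10  # largest value the API accepts for `num`


def build_url(api_key, cx, query, start=1, num=MAX_PER_REQUEST):
    """Build an image-search request URL (hypothetical helper)."""
    params = {
        "key": api_key,
        "cx": cx,
        "q": query,
        "searchType": "image",
        "num": num,
        "start": start,
    }
    return f"{CSE_ENDPOINT}?{urllib.parse.urlencode(params)}"


def image_links(response):
    """Return the image URLs in a decoded response, or [] if there are none."""
    return [item["link"] for item in response.get("items", [])]


def total_results(response):
    """Return the result count the engine reports (sent as a string)."""
    info = response.get("searchInformation", {})
    return int(info.get("totalResults", "0"))


def fetch(url):
    """Send the GET request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def collect_image_links(api_key, cx, queries, fetch_fn=fetch):
    """Issue one 10-result request per query and gather unique image URLs."""
    links = []
    for query in queries:
        response = fetch_fn(build_url(api_key, cx, query))
        for link in image_links(response):
            if link not in links:  # different queries can return the same image
                links.append(link)
    return links
```

To use it, pass your own API key, engine ID (cx), and a list of about 50 variations of the search string, then download each returned link. Calling total_results on any single response shows how many results the engine reports for that query, which is the count the first point above suggests checking.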