python google-gemini google-generativeai

How to send parallel requests to Google Gemini?

I have 107 images and want to extract the text from them using the Gemini API. This is my code so far:

import os

import google.generativeai as genai
from PIL import Image
from tqdm import tqdm

# Gemini model (safety_settings and image_dir are defined elsewhere)
model = genai.GenerativeModel('gemini-pro-vision', safety_settings=safety_settings)

# Build the list of image paths (107 images)
images_to_process = [os.path.join(image_dir, image_name) for image_name in os.listdir(image_dir)]

prompt = """Carefully scan this image: if it has text, extract all the text and return it. If the image does not have text, return '<000>'."""

# Process the images sequentially, one request per image
for image_path in tqdm(images_to_process):
    img = Image.open(image_path)
    output = model.generate_content([prompt, img])
    text = output.text

    print(text)

In this code, I am just taking one image at a time and extracting text from it using Gemini.

Problem: with 107 images, this code takes about 10 minutes to run. I know that the Gemini API can handle 60 requests per minute. How can I send 60 images at the same time, i.e. process them as a batch?

Solution

2024-10 update: I've added a Cookbook Quickstart on asynchronous requests to show how this works. The advice below is still correct.

In synchronous Python you can use something like a ThreadPoolExecutor to make your requests in separate threads.
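As a rough sketch of the thread-pool approach (not taken from the SDK docs): the helper below runs any blocking function over a list of paths in worker threads. The names `extract_all` and `extract_text` and the default of 8 workers are made up for illustration; you would pick a worker count that keeps you under your rate limit.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_all(extract_text, image_paths, max_workers=8):
    """Call extract_text(path) for every path in worker threads.

    pool.map returns results in the same order as image_paths,
    and re-raises the first exception a worker hit, if any.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_text, image_paths))


# Hypothetical wiring to the code in the question:
#
# def extract_text(path):
#     return model.generate_content([prompt, Image.open(path)]).text
#
# texts = extract_all(extract_text, images_to_process)
```

Because each request spends most of its time waiting on the network, threads are usually enough here even with the GIL.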

The Gemini Python SDK has an async API though, which can be a bit more natural:

$ python -m asyncio

>>> import asyncio
>>> import google.generativeai as genai
>>> import PIL.Image  # a bare "import PIL" does not load the Image submodule

>>> model = genai.GenerativeModel('gemini-pro-vision')
>>> imgs = ['/path/img.jpg', ...]
>>> prompt = "..."

>>> async def process_image(img: str) -> str:
...   r = await model.generate_content_async([prompt, PIL.Image.open(img)])
...   # TODO: error handling
...   return r.text

>>> jobs = asyncio.gather(*[process_image(img) for img in imgs])
>>> results = await jobs  # or run_until_complete(jobs)
>>> results
['text is here', ...]

This relies on the event loop that the asyncio REPL sets up implicitly. In a real app you'll need to start and use your own event loop, for example with asyncio.run().
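In a standalone script, one way to do that is sketched below. The helper `gather_limited` and its `limit` parameter are hypothetical names, not part of the SDK. It caps how many coroutines are in flight at once with an `asyncio.Semaphore`, which can help you stay near a quota such as 60 requests per minute. Note that it bounds concurrency, not the request rate itself.

```python
import asyncio


async def gather_limited(worker, items, limit=10):
    """Run worker(item) for every item, with at most `limit` calls in flight.

    Results come back in the same order as `items`.
    """
    sem = asyncio.Semaphore(limit)

    async def guarded(item):
        # Wait for a free slot before starting the call
        async with sem:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


# Hypothetical usage with process_image and imgs from the answer above:
#
# if __name__ == "__main__":
#     results = asyncio.run(gather_limited(process_image, imgs))
```

`asyncio.run()` creates the event loop, runs the coroutine to completion, and closes the loop again, so no manual loop management is needed.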