Tags: python, google-gemini, google-generativeai

How to send parallel requests to Google Gemini?


I have 107 images that I want to extract text from using the Gemini API. This is my code so far:

import os

import google.generativeai as genai
from PIL import Image
from tqdm import tqdm

# Gemini model (safety_settings and image_dir are defined elsewhere)
model = genai.GenerativeModel('gemini-pro-vision', safety_settings=safety_settings)

# Build the list of image paths (107 images)
images_to_process = [os.path.join(image_dir, image_name) for image_name in os.listdir(image_dir)]

prompt = """Carefully scan this image: if it has text, extract all the text and return the text from it. If the image does not have text, return '<000>'."""

# Send one blocking request per image, one after another
for image_path in tqdm(images_to_process):
    img = Image.open(image_path)
    output = model.generate_content([prompt, img])
    text = output.text

    print(text)

This code processes one image at a time, extracting its text with Gemini.

Problem: with 107 images, this code takes ~10 minutes to run. I know that the Gemini API can handle 60 requests per minute. How can I send 60 images at the same time? How can I do it in a batch?


Solution

  • 2024-10 update: I've added a Cookbook Quickstart on asynchronous requests to show how this works. The advice below is still correct.


    In synchronous Python you can use something like a ThreadPoolExecutor to make your requests in separate threads.
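
    A minimal sketch of the thread-pool approach, assuming the blocking SDK call is wrapped in a worker function. The names extract_all, extract_text and max_workers are hypothetical and chosen for illustration; the worker is passed in as an argument so the pattern can be tried without calling the API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def extract_all(
    image_paths: Iterable[str],
    extract_text: Callable[[str], str],
    max_workers: int = 8,
) -> List[str]:
    """Run extract_text on every path in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as image_paths and
        # re-raises a worker's exception when its result is reached.
        return list(pool.map(extract_text, image_paths))


# Hypothetical worker wrapping the blocking call from the question:
#
#     def extract_text(path: str) -> str:
#         return model.generate_content([prompt, Image.open(path)]).text
#
#     texts = extract_all(images_to_process, extract_text)
```

    Because each request spends most of its time waiting on the network, threads are usually enough here. A suitable max_workers depends on your quota, so treat the default above as a placeholder.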

    The Gemini Python SDK has an async API though, which can be a bit more natural:

    $ python -m asyncio
    
    >>> import asyncio
    >>> import google.generativeai as genai
    >>> import PIL.Image
    
    >>> model = genai.GenerativeModel('gemini-pro-vision')
    >>> imgs = ['/path/img.jpg', ...]
    >>> prompt = "..."
    
    >>> async def process_image(img: str) -> str:
    ...   r = await model.generate_content_async([prompt, PIL.Image.open(img)])
    ...   # TODO: error handling
    ...   return r.text
    
    >>> jobs = asyncio.gather(*[process_image(img) for img in imgs])
    >>> results = await jobs  # or run_until_complete(jobs)
    >>> results
    ['text is here', ...]
    

    This uses the implicit event loop of the asyncio REPL. In a real app you'll need to set up and run your own event loop, for example with asyncio.run().
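
    Outside the REPL, the same idea might look like the sketch below. It assumes a coroutine such as process_image from the session above is passed in as extract_text. The names extract_all_async and max_concurrency are hypothetical, and the semaphore is an addition that is not part of the original answer: it caps how many requests are in flight at once.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def extract_all_async(
    image_paths: Sequence[str],
    extract_text: Callable[[str], Awaitable[str]],
    max_concurrency: int = 10,
) -> List[str]:
    """Await extract_text for every path, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(path: str) -> str:
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await extract_text(path)

    # gather returns results in the same order as image_paths.
    return await asyncio.gather(*(bounded(p) for p in image_paths))


# In a script, asyncio.run creates the event loop, runs the coroutine,
# and closes the loop afterwards:
#
#     texts = asyncio.run(extract_all_async(imgs, process_image))
```

    Note that the semaphore limits concurrency, not requests per minute. Staying under a per-minute quota may still need retries or a delay between batches.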

    See also asyncio.TaskGroup (Python 3.11+).