openai-api azure-openai

Query @azure/openai with images?


On chat.openai.com I can upload an image and ask ChatGPT a question about it. With the existing openai and @azure/openai APIs, however, there doesn't seem to be a way to do this: the ChatCompletion object in both cases only takes text prompts.

Is this feature supported at the API level?


Solution

  • With OpenAI you just include your image as part of the message that you supply. Here is a piece from the code I use, which works whether you have an image or not:

    if image != '':
        # Get base64 string
        base64_image = encode_image(image)
        content = [
            {
                "type": "text",
                "text": your_prompt
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{base64_image}"
                }
            }
        ]
    else:
        content = your_prompt
    messages.append({"role": "user", "content": content})
    

    And then build the request payload:

    payload = {
        "model": model_name,
        "temperature": temperature,
        "max_tokens": tokens,
        "messages": messages
    }
    

    where encode_image() is defined as:

    import base64

    def encode_image(image_path):
        # Read the raw bytes and return them as a base64-encoded string
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode('utf-8')
    

    Currently you need to target the OpenAI model gpt-4-vision-preview. Update: as @Michael suggests, it also works with gpt-4o.
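
For reference, the fragments above could be combined into one self-contained script along these lines. This is only a sketch, not the answerer's full code: the helper names build_user_message() and build_payload(), the default temperature and max_tokens values, and the MIME-type guess from the file extension (the original hardcodes image/jpeg) are all assumptions added for illustration. It only builds the payload; sending it to the API is left out.

```python
import base64
import json
import mimetypes


def encode_image(image_path):
    """Return the file's bytes as a base64 string (same helper as in the answer)."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


def build_user_message(prompt, image_path=""):
    """Hypothetical helper: build one user message, with or without an image."""
    if not image_path:
        # No image: the content is just the prompt string.
        return {"role": "user", "content": prompt}
    # Assumption: guess the MIME type from the extension, falling back to JPEG.
    mime_type = mimetypes.guess_type(image_path)[0] or "image/jpeg"
    data_url = f"data:{mime_type};base64,{encode_image(image_path)}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


def build_payload(messages, model_name="gpt-4o", temperature=0.2, tokens=300):
    """Hypothetical helper: the defaults here are illustrative, not recommendations."""
    return {
        "model": model_name,
        "temperature": temperature,
        "max_tokens": tokens,
        "messages": messages,
    }


# Text-only usage; pass image_path="some_file.jpg" to attach an image instead.
messages = [build_user_message("Describe a sunset in one sentence.")]
payload = build_payload(messages)
print(json.dumps(payload, indent=2))
```

The resulting payload dictionary is what you would then send as the JSON body of the chat completions request, using whichever client you already have set up.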