On chat.openai.com I can upload an image and ask ChatGPT a question about it. With the existing openai and @azure/openai APIs, however, there doesn't seem to be a way to do this: the ChatCompletion object in both cases only takes text prompts.
Is this feature supported at the API level?
With OpenAI you just include your image as part of the message that you supply. Here is a piece of the code I use, which works whether or not you have an image:
if image != '':
    # Encode the local image file as a base64 string
    base64_image = encode_image(image)
    content = [
        {
            "type": "text",
            "text": your_prompt
        },
        {
            "type": "image_url",
            "image_url": {
                "url": f"data:image/jpeg;base64,{base64_image}"
            }
        }
    ]
else:
    content = your_prompt

messages.append({"role": "user", "content": content})
And then:
payload = {
    "model": model_name,
    "temperature": temperature,
    "max_tokens": tokens,
    "messages": messages
}
where encode_image() is defined as:
import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')
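The encoder can be sanity-checked without calling the API at all; a quick round-trip on a throwaway file (the placeholder bytes stand in for a real JPEG):

```python
import base64
import tempfile

def encode_image(image_path):
    # Same helper as above: read file bytes, return base64 text
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Throwaway file standing in for a real JPEG (just the JPEG magic bytes)
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(b"\xff\xd8\xff")
    path = tmp.name

encoded = encode_image(path)  # base64 text ready for the data: URI
data_uri = f"data:image/jpeg;base64,{encoded}"
```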
Currently you need to target the OpenAI model gpt-4-vision-preview. Update: as @Michael suggests, it also works with gpt-4o.
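Note that the image_url field also accepts a plain https URL, so base64 encoding is only needed for local files. A small helper (image_message is my own name for it) that builds the user message either way:

```python
def image_message(prompt, image_ref):
    # image_ref can be a public https URL or an already-built data: URI;
    # the chat completions API accepts either in the image_url field
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_ref}},
        ],
    }

msg = image_message("What is in this picture?", "https://example.com/cat.jpg")
```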