
Is it possible to use "gpt-4-vision-preview" with batching?


I am trying to use the "gpt-4-vision-preview" model with the batching option (since the limits are very low at the moment). This is my messages object (not sure if it's correct but I tried to follow the docs).

    let messages = batch.map(doc => {
        const imageUrl = `someurl`;
        const question = doc.questao;
        const answers = doc.respostas;
        // Render the answer options as "key: value" lines
        let options = Object.keys(answers).map(key => `${key}: ${answers[key]}`).join('\n');

        return {
            role: "user",
            content: [
                {
                    type: "text",
                    // questionExplanation is assumed to be defined outside this snippet
                    text: `${question} \n ${options} \n ${questionExplanation}`
                },
                {
                    type: "image_url",
                    image_url: {
                        url: imageUrl
                    }
                }
            ]
        };
    });

And this is how I make the request.

    const response = await openai.chat.completions.create({
        model: "gpt-4-vision-preview",
        max_tokens: 4000, // Adjust if needed
        messages: messages
    });

I could not find anything in the docs saying whether this is possible.


Solution

  • At the moment, the only form of batching available is to pass multiple images together with one text message.

    In that case your messages would look like this:

    const PROMPT_MESSAGES = [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "<PROMPT MESSAGE>",
          },
          // One image part per document in the batch
          ...batch.map((doc) => ({
            type: "image_url",
            image_url: {
              url: "<IMAGE_URL>",
              detail: "low",
            },
          })),
        ],
      },
    ];

    OPTIONAL: passing detail: "low" makes the model receive the low-res 512px x 512px version of the image, which it represents with a budget of 65 tokens.
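
    For reference, the structure above can be wrapped in a small helper. This is only a sketch: buildBatchedMessages is a hypothetical name, not part of the OpenAI SDK, and the URLs are placeholders. It assumes you already have one image URL per document.

```javascript
// Hypothetical helper (not part of the OpenAI SDK): builds a single user
// message holding one text prompt followed by one image part per URL.
function buildBatchedMessages(prompt, imageUrls, detail = "low") {
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        // One image_url part per image, all sharing the same detail setting
        ...imageUrls.map((url) => ({
          type: "image_url",
          image_url: { url, detail },
        })),
      ],
    },
  ];
}

// Placeholder URLs, for illustration only.
const exampleMessages = buildBatchedMessages("Describe each image in order.", [
  "https://example.com/a.png",
  "https://example.com/b.png",
]);
console.log(JSON.stringify(exampleMessages, null, 2));
```

    The returned array can be passed as messages to the same openai.chat.completions.create call shown in the question.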