azure-cognitive-services · chatgpt-api · azure-openai

Azure OpenAI gpt-35-turbo nondeterministic with temperature 0


I have noticed that my deployment of gpt-35-turbo on "Azure AI Studio" is not giving consistent responses to my chat completion prompts even when I set the temperature to 0. The longer the prompt, the more inconsistency I see.

I thought the idea with setting temperature to 0 meant consistent (deterministic) responses (given the same model). Is that not the case?


Solution

  • The temperature parameter controls the randomness of the generated text. A value of 0 yields the least random, most deterministic responses, but they will not necessarily be exactly the same.

    Ideally, in cases where you need truly deterministic responses (say, for the same input across devices/users/etc.), a response cache would help: store the model's output keyed by the request, and serve repeated identical requests from the store.

    Also, while the docs recommend altering either it or temperature but not both, you could use the top_p parameter to further constrain the output. It restricts sampling to the smallest set of candidate tokens whose cumulative probability reaches top_p.

    This discussion on the OpenAI forums goes into how you can use both to your benefit and better control the response from the models.