openai-api azure-openai

AsyncAzureOpenAI client from the openai Python SDK: support for a custom FastAPI endpoint


I am working on a project where the team has developed custom asynchronous LLM API endpoints using FastAPI and AzureOpenAI, and the application uses a B2B token to authenticate user requests. We now want to test those endpoints using the AsyncAzureOpenAI client from the openai SDK. Is it possible to pass the custom endpoint as the azure_endpoint or base_url argument? If so, I would appreciate an example.

from openai import AsyncAzureOpenAI

LATEST_URL = "http://localhost:8080"
model_id = "gpt-4"
B2B_token = "xxxx"

client = AsyncAzureOpenAI(
    azure_endpoint=f"{LATEST_URL}/v1/genai/models/deployments/{model_id}/chat/completions",
    max_retries=0,
    api_key=B2B_token,
)

But when I send a request through the client:

response = await client.chat.completions.create(
    messages=[{"role": "user", "content": "Hi"}],
    model=model_id,
    max_tokens=50,
    temperature=0.7,
    timeout=30,
)

the FastAPI server log shows a 404 error:

INFO:     127.0.0.1:52318 - "POST /v1/genai/models/deployments/gpt-4/chat/completions/openai/chat/completions?api-version=2024-06-01 HTTP/1.1" 404 Not Found

It seems the endpoint cannot be found because the client appends api-version as a query parameter to the API route, which is not expected.


Solution

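  • If you look at the 404 in the error log, you can see that the URL contains a duplicated path: the client appended its own /openai/chat/completions path and the api-version query to the value you passed as azure_endpoint.

    URL from the error log: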

    /v1/genai/models/deployments/gpt-4/chat/completions/openai/chat/completions?api-version=2024-06-01
    

    Typical URL for an Azure endpoint:

    /openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01
    

    Try changing the parameter value in the client from:

    azure_endpoint=f"{LATEST_URL}/v1/genai/models/deployments/{model_id}/chat/completions",
    

    to just:

    azure_endpoint=f"{LATEST_URL}"
    

    My suggestion is to get the full endpoint URL from the model deployment tab in Azure and compare it with the URL in your FastAPI log. Usually the fix is to adjust your local parameter values (model_id, endpoint URL, and API version) until the two match.
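
To make the comparison above concrete, here is a minimal sketch of how the request URL appears to be assembled. The helper `predicted_request_url` is hypothetical (it is not part of the openai SDK), and the rule that the deployment segment is skipped when the endpoint already contains "/deployments" is an assumption inferred from the 404 log in the question. `make_client` shows the suggested client setup; the api_version value is taken from the log and may differ in your environment.

```python
from urllib.parse import urlsplit


def predicted_request_url(azure_endpoint: str, deployment: str, api_version: str) -> str:
    """Simplified model of the chat completions URL the Azure client requests."""
    # The client treats azure_endpoint as the resource root and adds "/openai".
    base = azure_endpoint.rstrip("/") + "/openai"
    # Assumption inferred from the 404 log: if the endpoint already contains
    # "/deployments", no "/deployments/<model>" segment is inserted.
    if "/deployments" not in base:
        base += f"/deployments/{deployment}"
    return f"{base}/chat/completions?api-version={api_version}"


def make_client(endpoint: str, token: str, api_version: str):
    """Build the client with the bare server root, as suggested above."""
    from openai import AsyncAzureOpenAI  # imported lazily; requires the openai package

    return AsyncAzureOpenAI(
        azure_endpoint=endpoint,
        api_key=token,
        api_version=api_version,
        max_retries=0,
    )


LATEST_URL = "http://localhost:8080"
model_id = "gpt-4"
API_VERSION = "2024-06-01"  # value seen in the log; adjust as needed

# Endpoint from the question: the full route was passed as azure_endpoint.
original = predicted_request_url(
    f"{LATEST_URL}/v1/genai/models/deployments/{model_id}/chat/completions",
    model_id,
    API_VERSION,
)
# Endpoint from the answer: only the server root.
suggested = predicted_request_url(LATEST_URL, model_id, API_VERSION)

print(urlsplit(original).path)
print(urlsplit(suggested).path)
```

With the bare root, the request path becomes /openai/deployments/gpt-4/chat/completions. If the FastAPI app only serves /v1/genai/models/deployments/{model_id}/chat/completions, that path will still not match, so either the route or the URL passed to the client would need to be aligned with the other.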