I'm building a FastAPI endpoint that should stream the ChatCompletion response of GPT-3.5 from the OpenAI Python library. Here is my code:
@app.post("/ai_re/")
async def read_item(request: Request):
base_prompt = "This is a prompt"
sources = []
response = await asyncify(openai.ChatCompletion.create)(
model = 'gpt-3.5-turbo',
messages = [{"role": "system", "content": base_prompt.strip()}],
max_tokens = 550,
temperature = 0.28,
stream = True,
n = 1
)
async def event_generator():
for event in response:
event_text = event.choices[0].delta.content if "content" in event.choices[0].delta else ""
event_data = {
"texte": event_text,
"source": sources
}
yield f"data: {json.dumps(event_data)}\n\n"
return StreamingResponse(event_generator(), media_type="text/event-stream")
I use asyncify to make the request async; as a simple POST endpoint (no SSE) this works well and doesn't block the main thread.
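For reference, the non-streaming variant that behaves fine looks roughly like this (a minimal sketch; the route name is hypothetical and the parameters are the same as above):

@app.post("/ai_re_simple/")  # hypothetical route name for the non-streaming variant
async def read_item_simple(request: Request):
    base_prompt = "This is a prompt"
    # asyncify runs the blocking call in a worker thread, so the event loop stays free
    response = await asyncify(openai.ChatCompletion.create)(
        model="gpt-3.5-turbo",
        messages=[{"role": "system", "content": base_prompt.strip()}],
        max_tokens=550,
        temperature=0.28,
        n=1,
    )
    return {"texte": response.choices[0].message.content, "source": []}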
But in the streaming configuration above, the streaming itself works, yet all the other endpoints are blocked until this request finishes.
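Is the problem that the plain for event in response: loop pulls from the blocking OpenAI stream directly on the event loop? If so, would handing the iteration off to a worker thread, roughly like the sketch below, be the right direction? (iterate_in_threadpool comes from Starlette, which FastAPI already depends on; this is only a guess I haven't verified.)

from starlette.concurrency import iterate_in_threadpool

# inside read_item, replacing the generator above:
async def event_generator():
    # each step of the OpenAI stream runs in a worker thread,
    # so other requests can be served while waiting for the next token
    async for event in iterate_in_threadpool(response):
        delta = event.choices[0].delta
        event_text = delta.content if "content" in delta else ""
        yield f"data: {json.dumps({'texte': event_text, 'source': sources})}\n\n"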
I tried a few other things, one of which failed with:
coroutine object is not an iterator
Somehow, removing the async before the event generator worked out.
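To be concrete, by removing the async I mean the generator becomes a plain function (if I understand correctly, StreamingResponse iterates a plain generator in a threadpool itself, but I'm not sure that explains what I'm seeing):

# inside read_item:
def event_generator():  # plain generator, no async
    for event in response:
        event_text = event.choices[0].delta.content if "content" in event.choices[0].delta else ""
        yield f"data: {json.dumps({'texte': event_text, 'source': sources})}\n\n"

return StreamingResponse(event_generator(), media_type="text/event-stream")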