Tags: python, stream, artificial-intelligence, openai-api, llama-index

How to return the result of llama_index's index.query(query, streaming=True)?


I am trying to return the result of llama_index's index.query(query, streaming=True), but I'm not sure how to do it.

This first attempt obviously doesn't work:

index = GPTSimpleVectorIndex.load_from_disk(index_file)

return index.query(query, streaming=True)

Error message: TypeError: cannot pickle 'generator' object.
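That first error is expected: Python generators cannot be pickled at all, so anything that tries to serialize the raw streaming result (which holds a generator when streaming=True is set) will fail. A quick stand-alone check, using a plain generator as a stand-in:

```python
import pickle

def token_gen():
    # Stand-in for the generator held by a streaming query result.
    yield "token"

try:
    pickle.dumps(token_gen())
except TypeError as e:
    print(e)  # cannot pickle 'generator' object
```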

Neither does this one:

def stream_chat(query: str, index):
    for chunk in index.query(query, streaming=True):
        print(chunk)
        content = chunk["response"]
        if content is not None:
            yield content

# in another function
index = GPTSimpleVectorIndex.load_from_disk(index_file)
return StreamingResponse(stream_chat(query, index), media_type="text/html")

Error message: TypeError: 'StreamingResponse' object is not iterable.

Thanks!


Solution

  • OK, I figured it out. The answer is:

    return StreamingResponse(index.query(query, streaming=True).response_gen)

    With streaming=True, the query result exposes a response_gen attribute, a plain generator that yields the response tokens as strings, and that generator (not the result object itself) is what StreamingResponse should wrap.
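The pattern can be sketched without FastAPI or llama_index installed. Here FakeQueryResult is a hypothetical stand-in for the object that index.query(query, streaming=True) returns; the only assumption carried over from the real API is that its response_gen yields string chunks, which is exactly the kind of iterable StreamingResponse knows how to stream:

```python
class FakeQueryResult:
    """Hypothetical stand-in for llama_index's streaming query result."""

    def __init__(self, tokens):
        self._tokens = tokens

    @property
    def response_gen(self):
        # A plain generator of str chunks -- the shape of iterable that
        # a StreamingResponse can consume chunk by chunk.
        yield from self._tokens

result = FakeQueryResult(["Hello", ", ", "world"])
print("".join(result.response_gen))  # -> Hello, world
```

In a real endpoint you would pass result.response_gen straight to StreamingResponse instead of joining it, letting the framework forward each token to the client as it is produced.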