azure azure-cognitive-search azure-openai

Most efficient way to vector search multiple Azure indexes

I am developing an application that lets users chat with files whose chunks are spread across multiple Azure AI Search indexes. Running a separate vector search against each index is the obvious approach, but it has several downsides:

I end up with a large number of search results, which triggers the error "exceeded the maximum allowed number of tokens" when they are sent to Azure OpenAI as context.
If I keep only a handful of top results from each index (e.g. top 5), a lot of relevant context goes missing, and Azure OpenAI responds with something like "information is not available in the given context".
If I keep results only from the highest-scoring index, the actually relevant text might sit in another index that scored only slightly lower than the top one.
The Azure search score is hard to interpret: I observed that anything scoring above 4 could contain relevant information.
The biggest downside is latency. Searching multiple indexes takes several seconds, and Azure OpenAI then needs several more seconds to compile an answer from all the search results. That is not acceptable for a typical chat application, where the user expects an answer within a second or so.

What is the best solution to this? Is it possible to search multiple indexes in a single call? Is there an API that combines multiple indexes into one, so that the combined index can be used for chatting?

Solution

You can use EnsembleRetriever from LangChain to combine multiple indexes, then build a chain that queries an Azure OpenAI model.

First, install the required libraries.

%pip install --upgrade --quiet langchain
%pip install --upgrade --quiet langchain-community
%pip install --upgrade --quiet langchain-openai
%pip install --upgrade --quiet "azure-search-documents>=11.4"
%pip install --upgrade --quiet azure-identity

These are the indexes I have:

[Screenshot: list of indexes in the search service]

Take the index names and use them in the code below.

from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import AzureAISearchRetriever

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import AzureChatOpenAI


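# Index names to search; replace these with the names of your own indexes.
indexes = ["azureblob-index-2", "azureblob-index-2", "azureblob-index-2"]
retrievers = []

# Build one retriever per index.
for index in indexes:
    retriever = AzureAISearchRetriever(
        content_key="content",
        top_k=1,  # number of results taken from each index
        index_name=index,
        service_name="jgsai",
        api_key="<Your_ai_search_key>",
    )
    retrievers.append(retriever)

# Merge the results of all per-index retrievers into a single ranked list.
ensemble_retriever = EnsembleRetriever(
    retrievers=retrievers
)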



prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the context provided.

Context: {context}

Question: {question}"""
)

llm = AzureChatOpenAI(
    azure_deployment="<deployment_name>",
    api_key="<Your_azure_openai_key>",
    api_version="2024-05-01-preview",
    azure_endpoint="<endpoint>",
)




# The retriever fills {context}; the user's question is passed through unchanged.
chain = (
    {"context": ensemble_retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

chain.invoke("Give me solution for the concurrent operations on delta table.")

Output:

[Screenshot: the answer returned by the chain]

The answer was generated from the content of my documents.

See the LangChain documentation for more about building chains with AzureAISearchRetriever.
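Regarding the "exceeded the maximum allowed number of tokens" error from the question: one option is to cap the size of the context before it reaches the prompt. The following is only a minimal sketch. `Doc` is a hypothetical stand-in for a LangChain Document, `format_docs` and `max_chars` are names I made up for illustration, and a character budget is just a rough proxy for a token budget.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (only page_content is used)."""
    page_content: str


def format_docs(docs, max_chars=8000):
    """Join document texts in ranked order, stopping before max_chars is exceeded."""
    parts, used = [], 0
    for doc in docs:
        text = doc.page_content.strip()
        # +2 accounts for the blank-line separator placed between chunks
        cost = len(text) + (2 if parts else 0)
        if used + cost > max_chars:
            break
        parts.append(text)
        used += cost
    return "\n\n".join(parts)


docs = [Doc("a" * 50), Doc("b" * 50), Doc("c" * 50)]
context = format_docs(docs, max_chars=110)  # keeps the first two chunks only
```

In the chain above, such a function could be plugged in as "context": ensemble_retriever | format_docs, since LangChain wraps plain functions when they are piped after a runnable. For an exact limit you would count tokens with the model's tokenizer instead of characters.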

Next, you can fetch the documents relevant to a query with the retriever's invoke method.

ensemble_retriever.invoke("Give me solution for the concurrent operations on delta table.")

Output:

[Screenshot: the retrieved document]

This document belongs to my second index:

[Screenshot: the same document in the second index]
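As far as I know, EnsembleRetriever merges the per-index result lists with weighted Reciprocal Rank Fusion, which uses only the position of each document in its list. That is why the raw search scores from different indexes do not have to be comparable. Below is a rough sketch of the idea, not LangChain's actual code; the function name `weighted_rrf`, the document IDs, and the constant c=60 are assumptions for illustration.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights=None, c=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(weight / (rank + c)) over the lists it
    appears in, with rank starting at 1.
    """
    if weights is None:
        # Assumption: equal weight for every list when none are given
        weights = [1.0 / len(ranked_lists)] * len(ranked_lists)
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results from three indexes, best match first in each list
index_a = ["doc1", "doc2", "doc3"]
index_b = ["doc4", "doc2"]
index_c = ["doc2", "doc5"]
fused = weighted_rrf([index_a, index_b, index_c])
```

Here doc2 ranks first because it appears in all three lists. When every index holds different documents, the fusion mostly interleaves the lists by rank, so the top hit of each index stays near the top of the merged list.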