Search code examples
azure-cognitive-searchazure-openai

Azure OpenAI REST API Completions Extensions - Is more than one data source supported?


The AzureOpenAI REST API documentation (https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#completions-extensions) and the API Swagger (https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2023-08-01-preview/inference.json) suggest that the completions extensions endpoint can support more than one data source as it has a type of array and the description is "The data sources to be used for the Azure OpenAI on your data feature".

I've tested out adding 2 Azure Cognitive Search Indexes and I receive the error: "Validation error at #/dataSources: List should have at most 1 item after validation, not 2". Does this endpoint not currently support more than one data source? Do the data sources need to be different types (i.e. AzureCognitiveSearch and AzureCosmosDB)?


Solution

  • I had this same question. I asked my Microsoft contacts, and the answer is No, the API does not support multiple data sources (I'm not sure if it's on their roadmap or not).

    You will have to use LLM orchestration tools like Langchain, semantic model, or function calling with Prompt flow etc.

    You will have to build your prompt in such a way that your LLM uses the right data source based on the user's questions or use Open AI's function calling feature.

    So Azure Prompt Flow + Function calling would be how you would build a solution on the Azure platform leveraging multiple data sources.  You can also use LangChain in conjunction with Prompt Flow.

    Here is one interesting example matching a use case where one of the data sources is Bing search and other are different knowledge base stores.  This one uses Langchain orchestration in python code:

    Azure-Cognitive-Search-Azure-OpenAI-Accelerator GitHub

    For the prompt flow + function calling approach, you can review the example from the prompt flow gallery.  In 'prompt flow', click 'create new flow' and you will get access to a gallery with pre-built examples.  See Use GPT Function Calling.

    Alternatively, if you're using Power Virtual Agents (now Copilot Studio), you can use generative answers to leverage multiple data sources.

    Generative answers with Search and summarize - Information Sources

    With Copilot Studio you can create copilots with Generative Answers using public websites, SharePoint sources, documents and Azure OpenAI Service. Then you have the option to deploy your copilot to different channels listed here.