I have the following code:
from langchain.chains import ConversationalRetrievalChain
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS

chat_history = []
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(chunks, embeddings)
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0.1), db.as_retriever())
result = qa({"question": "What is stack overflow", "chat_history": chat_history})
The code creates embeddings, builds an in-memory FAISS vector store from the text in my chunks array, creates a ConversationalRetrievalChain, and then asks a question.
Based on my understanding of ConversationalRetrievalChain, when asked a question it will first query the FAISS vector DB, and only if it can't find anything matching will it go to OpenAI to answer the question. (Is my understanding correct?)
How can I detect whether it actually called OpenAI to get the answer, or whether it was able to get it from the in-memory vector DB? The result object contains only question, chat_history, and answer properties.
"Based on what I understand from ConversationalRetrievalChain, when asked a question, it will first query the FAISS vector db, then, if it can't find anything matching, it will go to OpenAI to answer that question."
This part is not correct. Each time ConversationalRetrievalChain receives a query in a conversation, it rephrases the question using the chat history, retrieves documents from your vector store (FAISS in your case), and returns an answer generated by the LLM (OpenAI in your case). In other words, ConversationalRetrievalChain is the conversational version of RetrievalQA. The vector DB never answers on its own; OpenAI is called on every query, and the retrieved chunks are only used as context for that call.
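To see this for yourself, you can ask the chain to return the retrieved documents and wrap the call in LangChain's OpenAI callback, which counts the tokens sent to the API. Here is a minimal sketch, assuming the same legacy langchain package as your snippet:

from langchain.callbacks import get_openai_callback

# return_source_documents=True adds a "source_documents" key to the result,
# showing which FAISS chunks were handed to the LLM as context
qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0.1),
    db.as_retriever(),
    return_source_documents=True,
)

with get_openai_callback() as cb:
    result = qa({"question": "What is stack overflow", "chat_history": chat_history})

print(result["source_documents"])  # the chunks retrieved from FAISS
print(cb.total_tokens)             # non-zero means OpenAI was actually called

You will find total_tokens is non-zero for every question, because the answer is always generated by OpenAI; FAISS only decides which chunks the model sees.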