I am playing with LangChain/OpenAI/FAISS to create a chatbot that reads all my PDFs and can answer questions based on their content.
What I want to know is whether there is a way to limit answers to knowledge from the documentation only; if the answer is not in the docs, the bot should respond with "I don't know" or something like that.
Here is the code:
import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

llm = ChatOpenAI(temperature=0, max_tokens=1000,
                 model_name="gpt-3.5-turbo-16k")
# return_messages=True so the chain receives the history as messages
memory = ConversationBufferMemory(memory_key="chat_history",
                                  return_messages=True)
chat = ConversationalRetrievalChain.from_llm(
    llm=llm, retriever=vector_store.as_retriever(), memory=memory)

if "messages" not in st.session_state:
    st.session_state.messages = []
if not st.session_state.messages:
    welcome_message = {"role": "assistant",
                       "content": "Hello, how can I help?"}
    st.session_state.messages.append(welcome_message)

# Replay the conversation so far
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

if prompt := st.chat_input("State your question"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)
    # The chain's memory already tracks the chat history, so only
    # the question needs to be passed in
    result = chat({"question": prompt})
    with st.chat_message("assistant"):
        full_response = result["answer"]
        st.markdown(full_response)
    st.session_state.messages.append(
        {"role": "assistant", "content": full_response})
Yes, there is. But first, keep in mind that the chatbot doesn't "learn" anything from the PDF texts you load into the vector store; it embeds the user's question and compares that embedding against your texts' embeddings to find the most similar matches in the vector store.
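To see what that retrieval step actually does, you can query the vector store directly. A minimal sketch, assuming vector_store is the FAISS store you already built (the question string is just an example):

# Nothing is "learned": the store simply returns the chunks whose
# embeddings are nearest to the question's embedding.
docs = vector_store.similarity_search("What does the manual say about setup?", k=4)
for doc in docs:
    print(doc.page_content[:200])  # preview each retrieved chunk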
To get you started with a document QA chatbot with conversational capabilities, I'd recommend you try this agent:
from langchain.chat_models import ChatOpenAI
from langchain.agents.agent_toolkits import create_retriever_tool
from langchain.agents.openai_functions_agent.agent_token_buffer_memory import AgentTokenBufferMemory
from langchain.agents.openai_functions_agent.base import OpenAIFunctionsAgent
from langchain.schema.messages import SystemMessage
from langchain.prompts import MessagesPlaceholder
from langchain.agents import AgentExecutor

# llm and vector_store are the same objects defined in your code above
retriever = vector_store.as_retriever()

# Expose the retriever as a tool the agent can decide to call
tool = create_retriever_tool(
    retriever,
    "search_document_content",
    "Useful for searching and querying the relevant content.",
)
tools = [tool]

memory_key = "history"
memory = AgentTokenBufferMemory(memory_key=memory_key, llm=llm)

# The system message is what keeps the answers grounded in the documents
system_message = SystemMessage(
    content=(
        "Use only the tools provided to look for context to answer "
        "the user's question. If you don't know the answers to the user's "
        "questions, truthfully say you don't know. Don't attempt to make up "
        "answers or hallucinate."
    )
)
prompt = OpenAIFunctionsAgent.create_prompt(
    system_message=system_message,
    extra_prompt_messages=[MessagesPlaceholder(variable_name=memory_key)],
)

agent = OpenAIFunctionsAgent(llm=llm, tools=tools, prompt=prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True,
    return_intermediate_steps=True,
)

result = agent_executor({"input": "<This is your PDF-related question>"})
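The agent's final answer is in result["output"]. In your Streamlit app you would call the executor in place of the ConversationalRetrievalChain; a sketch of just the changed lines in your handler:

# Replace the chat(...) call in your Streamlit handler with:
result = agent_executor({"input": prompt})
full_response = result["output"]  # the agent's final, grounded answer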
For a simpler approach to QA over a retriever, without the agent machinery, see the RetrievalQA chain in the LangChain documentation.
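With that chain you can get similar "answer from the docs or say you don't know" behavior by supplying a custom prompt. A minimal sketch, reusing the llm and vector_store from above; the prompt wording is illustrative:

from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

# Instructs the model to answer only from the retrieved context
# and to admit when the answer isn't there.
qa_prompt = PromptTemplate.from_template(
    "Answer the question using only the context below. "
    "If the answer is not in the context, say \"I don't know.\"\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(),
    chain_type="stuff",
    chain_type_kwargs={"prompt": qa_prompt},
)
print(qa.run("<your PDF-related question>"))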