I'm currently working with LangChain, using the TextLoader class to load text data from a file and query it through a Vectorstore index. However, I've noticed that response times to my queries increase as the text file grows larger, and I'm wondering how to speed them up.
Sample Code:
import os
import time
from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.chat_models import ChatOpenAI
import constants
os.environ["OPENAI_API_KEY"] = constants.OPENAI_API_KEY
loader = TextLoader("all_content.txt", encoding="utf-8")
# Record the start time
start_time = time.time()
index = VectorstoreIndexCreator().from_loaders([loader])
query = "My question?"
response = index.query(query).encode('utf-8').decode('utf-8')
print(response)
# Record the end time
end_time = time.time()
# Calculate the execution time
execution_time = end_time - start_time
print(f"Execution time: {execution_time:.4f} seconds")
My Questions:
Are there ways to optimize response times when using TextLoader?
Can caching be effectively employed to reduce response times? If so, how can I integrate it into my current implementation?
Are there alternative approaches or techniques I can employ to effectively shorten response times?
Any advice or suggestions for optimizing this implementation would be greatly appreciated. Thank you in advance!
I've read the LangChain docs and tried the Momento cache.
You can try multiple options.
RetrievalQA
ContextualCompressionRetriever
You can also try splitting the data into chunks (search for "text splitter" in LangChain). Chunking keeps each retrieved piece small, so only the most relevant chunks are passed to the LLM instead of the whole file. The code below shows this together with the two retrievers above.
import os
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS  # any supported vector store works (FAISS, Chroma, DeepLake, ...)
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
# text to write to a local file
# taken from https://www.theverge.com/2023/3/14/23639313/google-ai-language-model-palm-api-challenge-openai
text = """Google opens up its AI language model PaLM to challenge OpenAI and GPT-3
Google is offering developers access to one of its most advanced AI language models: PaLM.
The search giant is launching an API for PaLM alongside a number of AI enterprise tools
it says will help businesses “generate text, images, code, videos, audio, and more from
simple natural language prompts.”
PaLM is a large language model, or LLM, similar to the GPT series created by OpenAI or
Meta’s LLaMA family of models. Google first announced PaLM in April 2022. Like other LLMs,
PaLM is a flexible system that can potentially carry out all sorts of text generation and
editing tasks. You could train PaLM to be a conversational chatbot like ChatGPT, for
example, or you could use it for tasks like summarizing text or even writing code.
(It’s similar to features Google also announced today for its Workspace apps like Google
Docs and Gmail.)
"""
curr_dir = os.path.dirname(__file__)
with open(curr_dir+"/myfile.txt", "w") as file:
    file.write(text)
loader = TextLoader(curr_dir+"/myfile.txt")
docs_from_file = loader.load()
print(len(docs_from_file))
text_splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(docs_from_file)
print(len(docs))
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
# Build the vector store from the chunks (FAISS here; swap in any supported store)
db = FAISS.from_documents(docs, embeddings)
retriever = db.as_retriever()
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(model_name="text-davinci-003"),
    chain_type="stuff",
    retriever=retriever,
)
query = "How Google plan to challenge OpenAI ?"
response = qa_chain.run(query)
print(response)
llm = OpenAI(model_name="text-davinci-003", temperature=0)
compressor = LLMChainExtractor.from_llm(llm=llm)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,
)
retrieved_docs = compression_retriever.get_relevant_documents(
    "How does Google plan to challenge OpenAI?"
)
print(retrieved_docs[0].page_content)
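On your caching question: LangChain also supports an LLM-level cache, so repeated or identical prompts are answered from a local cache instead of calling the API again. A minimal sketch, assuming the standard InMemoryCache / SQLiteCache that ship with LangChain (the Momento cache you tried plugs in the same way):

import langchain
from langchain.cache import InMemoryCache

# Cache LLM responses in memory; repeated identical prompts return instantly
langchain.llm_cache = InMemoryCache()

# Or persist the cache to disk so it survives restarts:
# from langchain.cache import SQLiteCache
# langchain.llm_cache = SQLiteCache(database_path=".langchain.db")

Finally, the biggest cost in your sample code is that the whole file is re-embedded inside the timed block on every run, so the time grows with the file size. Build the vector store once, save it to disk, and only load it at query time. A sketch assuming the FAISS store from the code above (the "faiss_index" folder name is just an example):

# Run once after indexing:
db.save_local("faiss_index")

# At query time, load the prebuilt index instead of re-embedding the file:
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
db = FAISS.load_local("faiss_index", embeddings)
retriever = db.as_retriever()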