langchain · huggingface · large-language-model · llama-index · vector-database

I don't understand how the prompts work in llama_index


I have been trying to query a PDF file in my local directory using an LLM. I downloaded the model I'm using (GPT4All-13B-snoozy.ggmlv3.q4_0.bin) to my local system, and I'm using langchain together with Hugging Face's instructor-large model for the embeddings. I was able to set up the service_context and build the index, but when I query I keep getting this error about the prompt:

ValueError: Argument prompt is expected to be a string. Instead found <class 'llama_index.prompts.base.Prompt'>. If you want to run the LLM on multiple prompts, use generate instead.

I'm just starting to learn how to use LLMs; I hope the community can help me.

(screenshots of the full traceback: error message part 1, error message part 2)

from llama_index import VectorStoreIndex, SimpleDirectoryReader
from InstructorEmbedding import INSTRUCTOR
from llama_index import PromptHelper, ServiceContext
from llama_index import LangchainEmbedding
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import OpenLLM
# from langchain.chat_models.human import HumanInputChatModel
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

documents = SimpleDirectoryReader(r'C:\Users\avish.wagde\Documents\work_avish\LLM_trials\instructor_large').load_data()

model_id = 'hkunlp/instructor-large'

model_path = "..\models\GPT4All-13B-snoozy.ggmlv3.q4_0.bin"

callbacks = [StreamingStdOutCallbackHandler()]

# Verbose is required to pass to the callback manager
llm = GPT4All(model = model_path, callbacks=callbacks, verbose=True)

embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name = model_id))

# define prompt helper
# set maximum input size
max_input_size = 4096
# set number of output tokens
num_output = 256
# set maximum chunk overlap
max_chunk_overlap = 0.2

prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

service_context = ServiceContext.from_defaults(chunk_size= 1024, llm_predictor=llm, prompt_helper=prompt_helper, embed_model=embed_model)

index = VectorStoreIndex.from_documents(documents, service_context= service_context)

query_engine = index.as_query_engine()

response = query_engine.query("What is Apple's financial situation?")
print(response)

I have been going through the source code of the library, as the error message suggests, but I couldn't find the problem 😓


Solution

  • The code you have written here is a bit outdated and has a few small issues, but the main error is passing the LLM as llm_predictor=llm when setting up the service context. You can just pass the llm in directly as a kwarg.

    Using the latest version (v0.7.22 at the time of writing) I would re-write your service context like so:

    service_context = ServiceContext.from_defaults(
        chunk_size=1024,
        llm=llm,  # pass the LLM directly instead of llm_predictor
        prompt_helper=prompt_helper,
        embed_model=embed_model
    )
    
    

    Source: https://gpt-index.readthedocs.io/en/stable/core_modules/model_modules/llms/usage_custom.html#example-changing-the-underlying-llm
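
    For completeness, here is a minimal sketch of the rest of your script using that corrected service context (same paths, models, and query as in your question; assumes llama-index v0.7.x):

    # load the PDF from the local directory, as in the question
    documents = SimpleDirectoryReader(r'C:\Users\avish.wagde\Documents\work_avish\LLM_trials\instructor_large').load_data()

    # build the index with the corrected service context
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)

    # query as before
    query_engine = index.as_query_engine()
    response = query_engine.query("What is Apple's financial situation?")
    print(response)
    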

    For reference, if you pass in an llm from langchain like this, the service context detects this and wraps it with our langchain wrapper for you:

    from llama_index.llms import LangChainLLM
    
    llm = LangChainLLM(langchain_llm)
    

    This is useful to know, since other parts of llama-index (agents, chat engines, etc.) may expect an LLM object as input and won't wrap it for you.
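
    For example, a minimal sketch of wrapping the langchain GPT4All model yourself and calling it through llama-index's LLM interface (complete() is part of that interface in v0.7.x; the variable name wrapped_llm is just illustrative):

    from llama_index.llms import LangChainLLM

    # wrap the langchain GPT4All model explicitly so that components
    # expecting a llama-index LLM object will accept it
    wrapped_llm = LangChainLLM(llm)

    # the wrapped object exposes llama-index's LLM interface
    print(wrapped_llm.complete("Say hello in one short sentence."))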