I am implementing RAG on a Gemma-2B-it model using LangChain's HuggingFaceEmbeddings and ConversationalRetrievalChain.
When running:
chat_history = []
question = "My prompt"
result = qa.invoke({"question": question, "chat_history": chat_history})
I get the following traceback:
276
277 if self.pipeline.task == "text-generation":
--> 278 text = response["generated_text"]
279 elif self.pipeline.task == "text2text-generation":
280 text = response["generated_text"]
KeyError: 'generated_text'
I don't understand why this is happening; it worked before and today it just stopped. I have also tried qa.run instead of invoke, but it raises the same exception.
I have tried changing models and devices, but nothing fixes it.
If you're building your LLM with transformers.pipeline, make sure the return_tensors='pt' parameter is not passed. With return_tensors set, the text-generation pipeline returns raw token ids instead of decoded text, so the "generated_text" key that LangChain's wrapper looks up is missing, which produces exactly this KeyError.
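To see why the key disappears, here is a minimal sketch using plain dicts that mimic the two output shapes of the transformers text-generation pipeline (the sample values are made up; the lookup mirrors what LangChain's HuggingFacePipeline wrapper does):

```python
# Default pipeline output: decoded text under "generated_text".
default_output = [{"generated_text": "My prompt ... model answer"}]

# With return_tensors='pt' (or return_tensors=True) the pipeline skips
# decoding and returns token ids, so "generated_text" is absent.
tensor_output = [{"generated_token_ids": [2, 4521, 318]}]

def extract_text(response):
    # LangChain's wrapper essentially does this lookup, which raises
    # KeyError: 'generated_text' on the tensor-style output.
    return response["generated_text"]

print(extract_text(default_output[0]))   # decoded text, as expected
try:
    extract_text(tensor_output[0])
except KeyError as exc:
    print(f"KeyError: {exc}")            # KeyError: 'generated_text'
```

Dropping return_tensors='pt' from the pipeline call restores the "generated_text" key, and the chain works again.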