I am implementing RAG on a Gemma-2B-it model using LangChain's HuggingFaceEmbeddings and ConversationalRetrievalChain.
When running:
chat_history = []
question = "My prompt"
result = qa.invoke({"question": question, "chat_history": chat_history})
I get the following traceback:
276
277 if self.pipeline.task == "text-generation":
--> 278 text = response["generated_text"]
279 elif self.pipeline.task == "text2text-generation":
280 text = response["generated_text"]
KeyError: 'generated_text'
I don't understand why this is happening; it worked before and today it just stopped. I have also tried qa.run instead of invoke, but it raises the same exception.
I have tried changing models and devices, but nothing fixes it.
If you're building your LLM with transformers.pipeline, make sure the return_tensors='pt' parameter is not passed. With return_tensors set, the text-generation pipeline returns raw token ids instead of decoded text, so the "generated_text" key that LangChain's wrapper looks up is missing, which produces exactly this KeyError.
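To see why the key disappears, here is a minimal sketch using plain dicts that mimic the two output shapes of the transformers text-generation pipeline (the sample values are made up; the lookup mirrors what LangChain's HuggingFacePipeline wrapper does):

```python
# Default pipeline output: decoded text under "generated_text".
default_output = [{"generated_text": "My prompt ... model answer"}]

# With return_tensors='pt' (or return_tensors=True) the pipeline skips
# decoding and returns token ids, so "generated_text" is absent.
tensor_output = [{"generated_token_ids": [2, 4521, 318]}]

def extract_text(response):
    # LangChain's wrapper essentially does this lookup, which raises
    # KeyError: 'generated_text' on the tensor-style output.
    return response["generated_text"]

print(extract_text(default_output[0]))   # decoded text, as expected
try:
    extract_text(tensor_output[0])
except KeyError as exc:
    print(f"KeyError: {exc}")            # KeyError: 'generated_text'
```

Dropping return_tensors='pt' from the pipeline call restores the "generated_text" key, and the chain works again.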