Tags: databricks, huggingface-transformers, langchain, large-language-model

Databricks Dolly LLM: empty result when using LangChain with context


I'm following a tutorial on HuggingFace (let's say this one, though I get the same result with other Dolly models). I'm trying to run predictions with context but receive an empty string as output. I've tried different models and text variations.
Regular question answering works as expected; it only breaks when I ask questions about the context.
What could be the issue here?
context = """George Washington (February 22, 1732[b] – December 14, 1799) was an American military officer, statesman, and Founding Father who served as the first president of the United States from 1789 to 1797."""
llm_context_chain.predict(instruction="When was George Washington president?", context=context)
Out[5]: ''
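For reference, when a context is supplied, Dolly's instruct pipeline wraps instruction and context in a fixed prompt template. The sketch below is paraphrased from the databricks/dolly-v2 model card, so the exact template wording is an assumption:

```python
# Rough shape of the prompt Dolly builds when a context ("input") is given;
# the exact intro sentence and markers are assumptions based on the model card.
INTRO = ("Below is an instruction that describes a task, paired with an input "
         "that provides further context. Write a response that appropriately "
         "completes the request.")

def build_prompt(instruction: str, context: str) -> str:
    # Instruction and context are slotted into the template; the model is
    # expected to continue the text after "### Response:".
    return (f"{INTRO}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"Input:\n{context}\n\n"
            f"### Response:\n")

prompt = build_prompt("When was George Washington president?",
                      "George Washington ... served as the first president ...")
print(prompt)
```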
PS: I'm using a GPU cluster on Azure Databricks, if that matters.


Solution

  • Turns out the model initialization is different for LangChain. I was missing the return_full_text=True argument:

    import torch
    from transformers import pipeline
    
    generate_text = pipeline(model="databricks/dolly-v2-7b", torch_dtype=torch.bfloat16,
                             trust_remote_code=True, device_map="auto", return_full_text=True)
    


    Re-initializing the pipeline this way fixed it.
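    As to why this fixes it: LangChain's HuggingFacePipeline wrapper assumes a text-generation pipeline returns the full text (prompt plus completion) and recovers the answer by slicing off the first len(prompt) characters. With return_full_text=False the pipeline returns only the completion, so that slice removes most or all of the actual answer, yielding an empty string whenever the completion is shorter than the prompt. A minimal sketch of that slicing behaviour (strip_prompt is a hypothetical stand-in for what the wrapper does internally):

    ```python
    # Hypothetical illustration of LangChain's prompt stripping, which behaves
    # like response["generated_text"][len(prompt):].
    def strip_prompt(generated_text: str, prompt: str) -> str:
        return generated_text[len(prompt):]

    prompt = "When was George Washington president?\n\nInput:\nGeorge Washington ..."

    # With return_full_text=True the pipeline echoes the prompt, so slicing
    # off len(prompt) characters leaves exactly the completion:
    full = prompt + "He was president from 1789 to 1797."
    print(repr(strip_prompt(full, prompt)))        # 'He was president from 1789 to 1797.'

    # With return_full_text=False only the completion comes back; slicing off
    # len(prompt) characters from a short answer leaves an empty string:
    completion = "From 1789 to 1797."
    print(repr(strip_prompt(completion, prompt)))  # ''
    ```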