LangChain, Hugging Face: Can't evaluate model with two different inputs

I'm evaluating an LLM hosted on Hugging Face using LangChain and Python, with this code:


import os

from langchain import HuggingFaceHub, LLMChain, PromptTemplate

hugging_face_write = "MY_KEY"
os.environ['HUGGINGFACEHUB_API_TOKEN'] = hugging_face_write

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(
    prompt=prompt,
    llm=HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature": 0, "max_length": 64}),
)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

# Run the chain on the question; this is the call that raises the error.
print(llm_chain.run(question))


I get this error:

ValueError                                Traceback (most recent call last)
g:\Meine Ablage\python\lang_chain\ in line 1
----> 19 print(

File c:\Users\johan\.conda\envs\lang_chain\Lib\site-packages\langchain\chains\, in, *args, **kwargs)
    211     if len(args) != 1:
    212         raise ValueError("`run` supports only one positional argument.")
--> 213     return self(args[0])[self.output_keys[0]]
    215 if kwargs and not args:
    216     return self(kwargs)[self.output_keys[0]]

File c:\Users\johan\.conda\envs\lang_chain\Lib\site-packages\langchain\chains\, in Chain.__call__(self, inputs, return_only_outputs)
    114 except (KeyboardInterrupt, Exception) as e:
    115     self.callback_manager.on_chain_error(e, verbose=self.verbose)
--> 116     raise e
    117 self.callback_manager.on_chain_end(outputs, verbose=self.verbose)
    118 return self.prep_outputs(inputs, outputs, return_only_outputs)

File c:\Users\johan\.conda\envs\lang_chain\Lib\site-packages\langchain\chains\, in Chain.__call__(self, inputs, return_only_outputs)
    107 self.callback_manager.on_chain_start(
    108     {"name": self.__class__.__name__},
    109     inputs,
    110     verbose=self.verbose,
    111 )
    106 if self.client.task == "text-generation":
    107     # Text generation return includes the starter text.
    108     text = response[0]["generated_text"][len(prompt) :]

ValueError: Error raised by inference API: Model google/flan-t5-xl time out

What am I doing incorrectly? I'm a newbie...

Many thanks in advance, and best regards from Paris.


I ran the Python script above. After waiting for a while, it fails with the error shown.


  • You need to upgrade your Hugging Face account to the Pro version to run inference on a model this large.

    "google/flan-t5-base" works for the free account.
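
    If you want the script to keep working either way, one option is to try the large model first and fall back to the smaller one when the Inference API raises the ValueError from the traceback. This is only a minimal sketch, assuming the same legacy LangChain API as in the question (HuggingFaceHub and LLMChain imported from langchain). The helpers build_chain, run_on_hub and run_with_fallback are hypothetical names for illustration, not part of LangChain.

```python
from typing import Callable, List, Sequence, Tuple

PROMPT_TEMPLATE = """Question: {question}

Answer: Let's think step by step."""


def build_chain(repo_id: str):
    """Build an LLMChain for one Hub model, using the same legacy API as the question."""
    # Imported lazily so the fallback logic below can be reused without langchain installed.
    from langchain import HuggingFaceHub, LLMChain, PromptTemplate

    prompt = PromptTemplate(template=PROMPT_TEMPLATE, input_variables=["question"])
    llm = HuggingFaceHub(
        repo_id=repo_id,
        model_kwargs={"temperature": 0, "max_length": 64},
    )
    return LLMChain(prompt=prompt, llm=llm)


def run_on_hub(repo_id: str, question: str) -> str:
    """Run the question against one Hub model (needs HUGGINGFACEHUB_API_TOKEN to be set)."""
    return build_chain(repo_id).run(question)


def run_with_fallback(
    question: str,
    repo_ids: Sequence[str],
    run_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order and return (repo_id, answer) for the first that succeeds.

    Only ValueError is caught, because that is the exception type shown in the
    traceback for the inference API timeout. Other exceptions propagate.
    """
    errors: List[str] = []
    for repo_id in repo_ids:
        try:
            return repo_id, run_model(repo_id, question)
        except ValueError as exc:
            errors.append(f"{repo_id}: {exc}")
    raise ValueError("All models failed: " + "; ".join(errors))
```

    With the token set as in the question, calling run_with_fallback(question, ["google/flan-t5-xl", "google/flan-t5-base"], run_on_hub) would try flan-t5-xl first and only use flan-t5-base if the first call raises. Passing the runner as an argument keeps the fallback logic independent of the network, so it can be checked with a stub.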