Langchain, Huggingface: Can't evaluate model with two different inputs

I'm evaluating a LLM on Huggingface using Langchain and Python using this code:

# https://github.com/hwchase17/langchain/blob/0e763677e4c334af80f2b542cb269f3786d8403f/docs/modules/models/llms/integrations/huggingface_hub.ipynb

from langchain import HuggingFaceHub, LLMChain
import os

hugging_face_write = "MY_KEY"
os.environ['HUGGINGFACEHUB_API_TOKEN'] = hugging_face_write

from langchain import PromptTemplate, HuggingFaceHub, LLMChain

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature":0, "max_length":64}))

question = "What NFL team won the Super Bowl in the year Justin Beiber was born?"

print(llm_chain.run(question))

I get the error

ValueError                                Traceback (most recent call last)
g:\Meine Ablage\python\lang_chain\langchain_huggingface_example.py in line 1
----> 19 print(llm_chain.run(question))

File c:\Users\johan\.conda\envs\lang_chain\Lib\site-packages\langchain\chains\base.py:213, in Chain.run(self, *args, **kwargs)
    211     if len(args) != 1:
    212         raise ValueError("`run` supports only one positional argument.")
--> 213     return self(args[0])[self.output_keys[0]]
    215 if kwargs and not args:
    216     return self(kwargs)[self.output_keys[0]]

File c:\Users\johan\.conda\envs\lang_chain\Lib\site-packages\langchain\chains\base.py:116, in Chain.__call__(self, inputs, return_only_outputs)
    114 except (KeyboardInterrupt, Exception) as e:
    115     self.callback_manager.on_chain_error(e, verbose=self.verbose)
--> 116     raise e
    117 self.callback_manager.on_chain_end(outputs, verbose=self.verbose)
    118 return self.prep_outputs(inputs, outputs, return_only_outputs)

File c:\Users\johan\.conda\envs\lang_chain\Lib\site-packages\langchain\chains\base.py:113, in Chain.__call__(self, inputs, return_only_outputs)
    107 self.callback_manager.on_chain_start(
    108     {"name": self.__class__.__name__},
    109     inputs,
    110     verbose=self.verbose,
    111 )
...
    106 if self.client.task == "text-generation":
    107     # Text generation return includes the starter text.
    108     text = response[0]["generated_text"][len(prompt) :]

ValueError: Error raised by inference API: Model google/flan-t5-xl time out

What am I doing incorrectly? I'm a newbie...

Many thanks in advance, best regards from Paris,

Jennie

I ran my python script from above. After some waiting the shown error is given.

Solution

You need to upgrade your hugging face account to Pro version to host the large model for inference.

"google/flan-t5-base" works for the free account.