python apache-spark databricks langchain huggingface

LangChain + Hugging Face -> HuggingFacePipeline Error


I'm trying to run the following simple code:

    from transformers import pipeline
    import langchain
    from langchain.llms import HuggingFacePipeline

    model_name = "bert-base-uncased"
    task = "question-answering"

    hf_pipeline = pipeline(task, model=model_name)

    langchain_pipeline = HuggingFacePipeline(hf_pipeline)

I get the following error:

  • ERROR: TypeError: Serializable.__init__() takes 1 positional argument but 2 were given
  • LINE: langchain_pipeline = HuggingFacePipeline(hf_pipeline)

I haven't found anything online that actually helped me here.


I'm using Databricks with the following cluster:

  • Runtime: 12.2 LTS ML (includes Apache Spark 3.3.2, Scala 2.12)
  • Node type: Standard_DS5_v2 56 GB Memory, 16 Cores
  • Libraries: cluster2libraries

Solution

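  • Build the wrapper with HuggingFacePipeline.from_model_id instead of passing the pipeline object to the constructor positionally. Here's the code example: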

    from langchain_community.llms import HuggingFacePipeline

    model_name = "bert-base-uncased"
    # "question-answering" is not a supported task here, so use a supported one.
    task = "text-generation"

    # from_model_id loads the model and builds the transformers pipeline
    # internally, so transformers.pipeline() does not need to be called directly.
    langchain_pipeline = HuggingFacePipeline.from_model_id(
        model_id=model_name,
        task=task,
    )
    

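    Note that, as of this answer, HuggingFacePipeline only supports the tasks ('text2text-generation', 'text-generation', 'summarization'), so "question-answering" cannot be used with it.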
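
A small guard like the one below could catch an unsupported task before any model is loaded. This is only a sketch: `SUPPORTED_TASKS` copies the three task names listed above (your installed LangChain version may accept others), and `check_task` is a hypothetical helper, not part of LangChain.

```python
# Task names copied from the answer; treat this tuple as an assumption
# about your installed version rather than an authoritative list.
SUPPORTED_TASKS = ("text2text-generation", "text-generation", "summarization")


def check_task(task: str) -> str:
    """Hypothetical helper: return the task if it is listed, else raise ValueError."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unsupported task {task!r}; expected one of {SUPPORTED_TASKS}"
        )
    return task


# A supported task passes through unchanged.
print(check_task("text-generation"))

# The task from the question is rejected before any model is loaded.
try:
    check_task("question-answering")
except ValueError as err:
    print(err)
```

Calling `check_task(task)` just before `HuggingFacePipeline.from_model_id(...)` would fail fast with a readable message instead of a later error from inside the library.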

    HuggingFacePipeline documentation for reference: https://python.langchain.com/docs/integrations/llms/huggingface_pipelines
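
As for why the original call fails: the TypeError in the question is what Python raises when a constructor that accepts only keyword arguments is given a positional one. The sketch below reproduces that behavior with a stand-in class, so it runs without LangChain or transformers installed. `KeywordOnlyWrapper` and `fake_pipeline` are hypothetical names used purely for illustration.

```python
class KeywordOnlyWrapper:
    """Hypothetical stand-in for a class whose __init__ takes only keyword arguments."""

    def __init__(self, **kwargs):
        self.pipeline = kwargs.get("pipeline")


fake_pipeline = object()  # placeholder for the transformers pipeline object

# Positional call, analogous to HuggingFacePipeline(hf_pipeline) in the question.
try:
    KeywordOnlyWrapper(fake_pipeline)
    raised = False
except TypeError as err:
    raised = True
    print(err)  # message reports one positional argument expected, two given

# Keyword call, analogous to HuggingFacePipeline(pipeline=hf_pipeline).
wrapper = KeywordOnlyWrapper(pipeline=fake_pipeline)
print(wrapper.pipeline is fake_pipeline)
```

With the real class, the keyword form `HuggingFacePipeline(pipeline=hf_pipeline)` should get past the constructor, but the task restriction noted above still applies, so a "question-answering" pipeline is likely to fail later when it is invoked.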