python, importerror, langchain, huggingface

ImportError: partition_docx is not available using Langchain on HuggingFace


I am using DirectoryLoader with LangChain in a Hugging Face Space (Gradio SDK) to load the files in my folder named "data", like so:

from langchain.document_loaders import DirectoryLoader  
  
loader = DirectoryLoader('./data/')  
raw_documents = loader.load() 

but get the following error:

ImportError: partition_docx is not available. Install the docx dependencies with pip install "unstructured[docx]"

Does anyone have any insight as to why this error is being given? Nothing pops up for me on a web search for this error.

Thanks in advance! Apologies if more context is needed; I am just getting into Python and am very much a novice.


Solution

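  • The OP, Leanna, created the Gradio Space and then noticed the import error, so here are all the details.

    By default, DirectoryLoader parses files with the unstructured package, and parsing .docx files requires that package's docx extra. A Hugging Face Space installs additional Python packages from requirements.txt, so the extra has to be listed there.

    To debug this, create a Gradio Space in Hugging Face:

    1. Create the Gradio Space.
    2. To manage dependencies, create requirements.txt and add the modules that are needed: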
    langchain
    unstructured
    unstructured[docx]
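
Once the Space has rebuilt, one way to confirm that the packages listed above were actually installed is to probe for them from Python. This is only a minimal sketch: the helper name missing_modules is made up for illustration, and it assumes the unstructured[docx] extra pulls in python-docx, whose import name is docx.

```python
from importlib.util import find_spec

# Import names to probe. "docx" is the import name of python-docx, which
# this sketch ASSUMES is what the unstructured[docx] extra installs.
REQUIRED_MODULES = ["langchain", "unstructured", "docx"]


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported as missing, the requirements.txt entry for it probably did not install, and the Space build logs are the next place to look.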
    

    The files can be reviewed at: SimpleappGradio
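
Since the error is triggered by a .docx file in the loaded folder, it can also help to see which file types the folder actually contains before deciding which extras to install. The sketch below uses only the standard library; the helper name count_extensions is hypothetical, and the "./data" path is taken from the question.

```python
from collections import Counter
from pathlib import Path


def count_extensions(folder):
    """Count files under `folder` (recursively) by lowercase extension."""
    root = Path(folder)
    if not root.is_dir():
        # Nothing to scan if the folder does not exist.
        return Counter()
    return Counter(
        path.suffix.lower() or "(no extension)"
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # "./data" is the folder name used in the question.
    for ext, count in sorted(count_extensions("./data").items()):
        print(f"{ext}: {count}")
```

Any .docx entry in the output means the docx extra is needed; other document types may need their own extras, which this sketch does not attempt to map.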