UPDATE: After I'd posted this question I found that this issue had already been raised on GitHub: https://github.com/nomic-ai/gpt4all/issues/1394 I can delete this question if needed, but can anyone suggest a workaround, or maybe an alternative way to generate embeddings? Thanks!
I have been trying to build my first application using LangChain, Chroma and a local LLM (Ollama in my case). I've been following the (very straightforward) steps from:
https://python.langchain.com/docs/integrations/llms/ollama and also tried https://python.langchain.com/docs/integrations/text_embedding/gpt4all
The problem I'm having is with the step that creates embeddings using the GPT4AllEmbeddings model. I can see the model is downloaded to ~/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin, although its size (45.5 MB) is surprisingly small. When I try to use it, it fails with this error:
>>> gpt4all_embd = GPT4AllEmbeddings()
100%|████████████████████████████████████████████| 45.5M/45.5M [00:05<00:00, 7.66MiB/s]
Model downloaded at: /<MY-HOME-PATH>/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin
Invalid model file
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for GPT4AllEmbeddings
__root__
Unable to instantiate model (type=value_error)
I have tried the same steps on different machines and I'm still getting the same error. Googling didn't help.
Please follow the steps below.
In your activated virtual environment, run:
pip install -U langchain
pip install gpt4all
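If gpt4all was already installed, it is also worth upgrading it, since an older binding may not be able to read newer model files (this is an assumption based on the error message, not something confirmed in the linked issue):
pip install -U gpt4all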
Sample code:
from langchain.embeddings import GPT4AllEmbeddings
gpt4all_embd = GPT4AllEmbeddings()
query_result = gpt4all_embd.embed_query("This is a test doc")
print(query_result)
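If instantiation succeeds, the same object can also embed several documents at once via embed_documents, which is part of the same LangChain Embeddings interface (a minimal sketch; the texts are placeholders):
from langchain.embeddings import GPT4AllEmbeddings

gpt4all_embd = GPT4AllEmbeddings()
# embed_documents takes a list of texts and returns one vector per text
doc_vectors = gpt4all_embd.embed_documents(["first test doc", "second test doc"])
print(len(doc_vectors), len(doc_vectors[0]))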
Another option: embeddings through Hugging Face
pip install langchain sentence_transformers
Sample code:
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()
text = "This is a test document."
query_result = embeddings.embed_query(text)
print(query_result[:3])
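Since you mentioned Chroma, either embeddings object can be plugged straight into LangChain's Chroma vector store (a minimal sketch, assuming chromadb is installed via pip install chromadb; the texts and query are placeholders):
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings()
# Build an in-memory Chroma collection from a few sample texts
db = Chroma.from_texts(["This is a test document.", "Another document."], embedding=embeddings)
# Retrieve the single most similar document for a query
docs = db.similarity_search("test document", k=1)
print(docs[0].page_content)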