I am generating a Chroma DB that holds vector embeddings for several PDF documents, and I want to store it to avoid recomputing the embeddings every time I run with different inputs. Pickling and JSON serialization do not seem to work for the Chroma object, and importing it from another file also makes the embedding script run again.
You can pass a persist_directory when using ChromaDB with LangChain:
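# Import paths as in classic LangChain releases; newer releases move these to separate packages.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
persist_directory = 'db'
embedding = OpenAIEmbeddings()
vectordb = Chroma.from_documents(documents=texts, embedding=embedding, persist_directory=persist_directory)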
This stores the embedding results in a folder named db.
The next time you need the db, load it from disk instead of recomputing it:
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)
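To avoid choosing between the two calls by hand, one option is a small wrapper that builds the store only when the folder does not hold anything yet. The following is a minimal sketch, not part of LangChain or Chroma: load_or_build is a hypothetical helper name, and the check assumes that a non-empty persist_directory means a store was already written there.

```python
import os
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(persist_directory: str,
                  build: Callable[[], T],
                  load: Callable[[], T]) -> T:
    """Return load() if persist_directory already holds files, else build().

    Hypothetical helper: it only inspects the folder and does not
    validate that the contents are a usable Chroma store.
    """
    # Assumption: an existing, non-empty folder is a previously persisted store.
    if os.path.isdir(persist_directory) and os.listdir(persist_directory):
        return load()
    # Nothing persisted yet, so pay the embedding cost once.
    return build()


# Possible usage with the names from the answer above (left commented out,
# since it needs LangChain, an OpenAI key, and the `texts` documents):
#
# vectordb = load_or_build(
#     persist_directory,
#     build=lambda: Chroma.from_documents(
#         documents=texts, embedding=embedding,
#         persist_directory=persist_directory),
#     load=lambda: Chroma(
#         persist_directory=persist_directory,
#         embedding_function=embedding),
# )
```

Passing the two calls as lambdas means the embedding step runs only when build is actually chosen, so importing the wrapper from another file does not trigger any recomputation on its own.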