Tags: python, langchain, chromadb

How to prevent VectorstoreIndexCreator from loading the persistent directory into memory


I'm using LangChain (LLM and Chroma) with FastAPI as the back end for a custom chatbot.

First, I call an API endpoint that creates the persistent folder from the file(s) already present in the asset folder, as follows:

_ = VectorstoreIndexCreator(
    vectorstore_kwargs={"persist_directory": self.__getPersistPath()},
    embedding=self.__getVectorEmbedding(),
).from_loaders([DirectoryLoader(path=self.__getAssetPath())])

Then I call an API endpoint that deletes a file from my asset folder. The main logic is:

ChatbotHelper.removeDirectory(pathPersist)
os.mkdir(pathPersist)
filePath = os.path.join(pathAsset, file_name)  # avoids a hard-coded "\\" separator
os.remove(filePath)
if len(fnmatch.filter(os.listdir(self.__getAssetPath()), "*.*")) > 0:
    _ = self.createVectorStore()

This is where the following error appears: [WinError 32] The process cannot access the file because it is being used by another process

The removeDirectory helper in my ChatbotHelper class:

@staticmethod
def removeDirectory(path):
    if os.path.exists(path):
        for root, dirs, files in os.walk(path, False):
            for name in files:
                file_path = os.path.join(root, name)
                os.remove(file_path)
            for name in dirs:
                dir_path = os.path.join(root, name)
                os.rmdir(dir_path)
        os.rmdir(path)
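
For comparison, a minimal sketch of the same helper built on the standard library's shutil.rmtree, with a short retry for files that are only briefly locked. The name remove_directory and the retries/delay parameters are hypothetical, not part of the original code. On Windows, WinError 32 is raised as PermissionError, which is what the sketch catches.

```python
import shutil
import time
from pathlib import Path


def remove_directory(path, retries=3, delay=0.5):
    """Delete a directory tree, retrying if a file is briefly locked.

    Returns True if the directory is gone afterwards, False otherwise.
    (Hypothetical helper; retries and delay are illustrative defaults.)
    """
    target = Path(path)
    for attempt in range(retries):
        if not target.exists():
            # Nothing to delete (or a previous attempt already succeeded).
            return True
        try:
            shutil.rmtree(target)
            return True
        except PermissionError:
            # On Windows, WinError 32 surfaces as PermissionError while
            # another handle keeps a file inside the tree open.
            if attempt < retries - 1:
                time.sleep(delay)
    return not target.exists()
```

Retrying only helps when the lock is short-lived. If the handle is held by the same long-running process (as appears to be the case here, where the vector store was created inside the FastAPI worker), the call will keep failing until that process releases the files or exits.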

I would like LangChain/Chroma to stop automatically loading my persistent data into memory, or a workaround that lets me delete the persistent folder.


Solution

  • I resolved this issue by defining the reload_includes parameter for uvicorn.run()

    Usage:

    reload_includes = ["*.py", "*pdf", "*txt"]
    uvicorn.run(app_str, reload=True, reload_includes=reload_includes)
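
A slightly fuller sketch of that usage, assuming the FastAPI instance is importable as "main:app" (a placeholder, as are the function names below). The likely reason it helps is that a change to a watched .pdf or .txt file makes uvicorn restart the worker process, which drops whatever file handles the old process held on the persistent folder; this is an interpretation, not something the answer states. As far as I know, reload_includes only takes effect when the optional watchfiles package is installed.

```python
# Glob patterns copied from the answer above: Python sources plus the
# asset file types whose changes should trigger a reload.
RELOAD_INCLUDES = ["*.py", "*pdf", "*txt"]


def build_uvicorn_kwargs():
    """Collect the reload-related keyword arguments for uvicorn.run()."""
    return {
        "reload": True,
        "reload_includes": list(RELOAD_INCLUDES),
    }


def run_dev_server(app_str="main:app"):
    """Start uvicorn with auto-reload enabled.

    app_str must be an import string (not an app object) when reload=True.
    "main:app" is a hypothetical placeholder for the real module path.
    """
    import uvicorn  # imported lazily so the helper above has no dependencies

    uvicorn.run(app_str, **build_uvicorn_kwargs())
```

Calling run_dev_server() from the project's entry point starts the server. Auto-reload is intended for development, so this workaround is probably not suitable for a production deployment as-is.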