I was able to successfully load and process my Confluence data at scale, with something like:
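(Paraphrased from my notebook; the Confluence URL, credentials, and space key are placeholders, and the tiny splitter numbers stand in for the sample values I had copied.)

from langchain.document_loaders import ConfluenceLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Pull the pages from a Confluence space
loader = ConfluenceLoader(
    url="https://yoursite.atlassian.net/wiki",
    username="me@example.com",
    api_key="<api-key>",
)
docs = loader.load(space_key="<space-key>", limit=50)

# Split the pages into chunks for embedding
# (chunk_size copied from a sample -- this turns out to matter, see below)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=26, chunk_overlap=4)
splits = text_splitter.split_documents(docs)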
However, when I tried to persist it in a vector DB with something like:
from langchain.vectorstores import Chroma

# Embed the chunks and persist them to a local Chroma store
vectordb = Chroma.from_documents(
    documents=splits,
    embedding=embedding,
    persist_directory=persist_directory,
)
it ran for a couple of hours on my modest laptop before eventually throwing an APIConnectionError: Connection error exception.
Is it some kind of timeout? If so, how do I get around it?
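If it is a timeout, would raising the client timeout and retry limits help? I am thinking of something along these lines (untested; request_timeout and max_retries are parameters on LangChain's OpenAIEmbeddings):

from langchain.embeddings import OpenAIEmbeddings

# Give the embedding client a longer per-request timeout and more retries
embedding = OpenAIEmbeddings(
    request_timeout=120,  # seconds allowed per API request
    max_retries=10,       # retry transient connection failures
)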
Any ideas?
Stack Overflow does not allow me to share the complete call stack here since it is very large, but I have posted it at https://community.deeplearning.ai/t/apiconnectionerror-connection-error-exception-persisting-confluence-data-to-vectordb/544670 in case it helps narrow down the issue.
You can find my complete code at https://github.com/sameermahajan/GenAI/blob/main/LangChain/Chat-With-Your-Data/chat_azure.ipynb, in which you need to replace the PDF loader part with https://github.com/sameermahajan/GenAI/blob/main/LangChain/Chat-With-Your-Data/confluence.py for Confluence.
This was actually due to a low chunk size. When I increased it to 1000, it worked fine. I had overlooked the number when copy-pasting the code from a sample!
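In other words, the fix is just the splitter configuration (a sketch; chunk_size=1000 is the change that mattered, the chunk_overlap value is my own choice):

from langchain.text_splitter import RecursiveCharacterTextSplitter

# A tiny chunk size produces a huge number of chunks (and embedding API
# calls); chunk_size=1000 keeps that manageable, which is presumably why
# the connection error goes away.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
splits = text_splitter.split_documents(docs)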