I use llama_index in Jupyter Notebooks running in a Docker container. For data persistence I need to mount the cache folder from the host into the Docker container. So my question is: what is the name of the "cache" folder that ServiceContext from llama_index uses, and how can I locate it?
Consider the following code example in Python:
from llama_index import VectorStoreIndex
from llama_index import ServiceContext
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
service_context = ServiceContext.from_defaults(
llm=llm, embed_model="local:BAAI/bge-small-en-v1.5"
)
This code gives the following output:
config.json: 100%
743/743 [00:00<00:00, 32.2kB/s]
model.safetensors: 100%
133M/133M [00:32<00:00, 3.95MB/s]
tokenizer_config.json: 100%
366/366 [00:00<00:00, 31.0kB/s]
vocab.txt: 100%
232k/232k [00:00<00:00, 1.44MB/s]
tokenizer.json: 100%
711k/711k [00:00<00:00, 2.31MB/s]
special_tokens_map.json: 100%
125/125 [00:00<00:00, 12.9kB/s]
So ServiceContext successfully downloaded the files above. But where are these files saved on my side? I can't find them, and after the Jupyter Notebook restarts all of this is lost and I have to download everything again.
Does it use Hugging Face under the hood? I checked the default .cache folder that Hugging Face uses, but there are no model artifacts for "local:BAAI/bge-small-en-v1.5" there.
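For reference, this is roughly how I checked the usual Hugging Face cache locations (the paths and environment variables below are just the standard Hugging Face ones, nothing llama_index-specific):
import os
# Standard Hugging Face cache locations/overrides I checked;
# none of them contained the BAAI/bge-small-en-v1.5 files.
candidates = [
    os.environ.get("HF_HOME"),
    os.environ.get("HF_HUB_CACHE"),
    os.environ.get("TRANSFORMERS_CACHE"),
    os.path.expanduser("~/.cache/huggingface/hub"),
]
for path in candidates:
    if path:
        print(path, "->", "exists" if os.path.isdir(path) else "missing")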
A Linux search with find -iname "model.safetensors" also gave nothing.
On my Mac the files are stored under ~/Library/Caches/llama_index/models. It would be nice if it used HF_HUB instead.
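What I would like to end up with is something like the sketch below: point whatever cache llama_index uses at a folder mounted from the host. The LLAMA_INDEX_CACHE_DIR environment variable and the /workspace/llama_index_cache path are assumptions on my part; I don't know whether llama_index actually honors such a variable for the "local:" embedding models.
import os
# Assumption: an env var like LLAMA_INDEX_CACHE_DIR that llama_index would honor;
# /workspace/llama_index_cache is a hypothetical folder mounted from the host.
os.environ["LLAMA_INDEX_CACHE_DIR"] = "/workspace/llama_index_cache"
from llama_index import ServiceContext
from llama_index.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model="local:BAAI/bge-small-en-v1.5"
)
If there is an officially supported way to do this (an environment variable or a parameter), that would solve my problem.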