With the
docs = db.similarity_search(query='some query here')
method I can output a single document or multiple documents from the deeplake vectorstore. Is there a method to output all documents?
I ask because my documents are structured like this:
page_content='256 128 256zM208 160c-8,836 0-16-...
384C234.5 384 256 362.5 256 336C256 309.5 234.5 288 208'
metadata={'source':'chatbot/app/solid.min.js','file_name':'solid.min.js'}
I would like to get all documents whose metadata.file_name corresponds to a particular file. Unfortunately I can't find any documentation for this, which is why I'm asking here for your experience.
You can query and apply a filter on your metadata:
def query_datalake(db, query, subject):
    filter = {"metadata": {"source": f"output\\{subject}.txt"}}
    # Distance function L2 for Euclidean, L1 for Nuclear, Max l-infinity distance, cos for cosine similarity, dot for dot product
    docs = db.similarity_search(query, filter=filter, distance_metric="cos", k=10)
    return docs
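
To adapt this to the file_name field from the question, the same filter shape can be keyed on file_name instead of source. Below is a minimal sketch of that idea, not the library's implementation: `Doc`, `build_file_name_filter` and `matches` are hypothetical names, the sample documents are made up for illustration, and it assumes the dict filter matches each metadata key by exact equality.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a document with page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_file_name_filter(file_name):
    # Same shape as the filter in the answer above, keyed on file_name.
    return {"metadata": {"file_name": file_name}}


def matches(doc, flt):
    # Assumed semantics: every key in the filter must equal the
    # corresponding value in the document's metadata.
    return all(doc.metadata.get(k) == v for k, v in flt["metadata"].items())


# Made-up sample documents, only to show which ones the filter keeps.
docs = [
    Doc("first chunk", {"source": "chatbot/app/solid.min.js", "file_name": "solid.min.js"}),
    Doc("second chunk", {"source": "chatbot/app/other.js", "file_name": "other.js"}),
]

flt = build_file_name_filter("solid.min.js")
selected = [d for d in docs if matches(d, flt)]
print([d.metadata["file_name"] for d in selected])
```

With a real store, `flt` would be passed as `filter=flt` to `db.similarity_search(...)` as in the answer above. Note that `k` still caps the number of results, so getting every document for a file would presumably need `k` at least as large as the number of chunks in that file.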