I am running the following ChromaDB query:
results = collection.query(
    query_texts=["AUSSIE SHAMPOO MIRACULOUSLY SMOOTH 180 ML x 1"],
    n_results=3,
    include=['documents', 'distances', 'embeddings']
)
This retrieves data from the vector database as expected, but I would also like to obtain the embedding of the query text ("AUSSIE SHAMPOO MIRACULOUSLY SMOOTH 180 ML x 1"), because I plan to add it to the collection (vector database) after some processing. Is there any way to do that?
I know I can simply run my embedding function on the query text, but since the Chroma query already embeds it, it would be more efficient to retrieve that embedding directly.
You can create your embedding function explicitly (instead of relying on the default), e.g. using OpenAI:
from chromadb.utils import embedding_functions
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=openai_api_key, model_name="text-embedding-ada-002"
)
or sticking to the default:
default_ef = embedding_functions.DefaultEmbeddingFunction()
You'd then typically pass that to the collection like this:
collection = chroma_client.get_collection(
    name="my_collection", embedding_function=openai_ef
)
and use your collection normally.
However, to answer your question, you can now embed your query like this:
embedding = openai_ef(["AUSSIE SHAMPOO MIRACULOUSLY SMOOTH 180 ML x 1"])
and pass that embedding to the collection to find similar documents:
results = collection.query(query_embeddings=embedding, n_results=3)
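Putting the pieces together, here is a minimal sketch of the whole round trip: embed the text once, query with that vector, then add the same vector to the collection. The helper name `query_and_store` and the `new_id` parameter are made up for illustration; `collection` is assumed to be a Chroma collection and `embed` an embedding function such as `openai_ef` above.

```python
from typing import Any, Callable, Sequence


def query_and_store(
    collection: Any,
    embed: Callable[[list], Sequence],
    text: str,
    new_id: str,
    n_results: int = 3,
) -> dict:
    """Embed `text` once, query with that vector, then add it to the collection."""
    # Compute the embedding ourselves so we keep a handle on it.
    embedding = embed([text])[0]

    # Passing query_embeddings (instead of query_texts) means the query
    # uses our precomputed vector rather than embedding the text again.
    results = collection.query(
        query_embeddings=[embedding],
        n_results=n_results,
        include=["documents", "distances"],
    )

    # ... any processing of `results` would go here ...

    # Reuse the same vector on insert, so the text is never embedded twice.
    collection.add(ids=[new_id], embeddings=[embedding], documents=[text])
    return results


# Hypothetical usage (the id is an arbitrary example):
# results = query_and_store(
#     collection, openai_ef,
#     "AUSSIE SHAMPOO MIRACULOUSLY SMOOTH 180 ML x 1",
#     new_id="aussie-shampoo-180ml",
# )
```

This way the embedding model is called only once per text, which is the efficiency you were after.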