langchain vector-database chromadb

Do LOWER scores from Chroma's similarity_search_with_score mean HIGHER accuracy?


I have a quick question: I'm using the Chroma vector store with LangChain.

I set up a simple docsearch with Chroma.from_texts. I was initially very confused because I thought the score from similarity_search_with_score would be higher for queries that are close to answers, but from my testing the opposite seems to be true. Is this because it's returning the 'distance' between the two vectors when it searches? I looked at the docs, but they only say "List of Documents most similar to the query and score for each" and don't explain what 'score' is.

Doc reference: https://python.langchain.com/en/latest/reference/modules/vectorstores.html?highlight=similarity_search#langchain.vectorstores.Annoy.similarity_search_with_score I can also give more info on the (small to start) dataset I'm using and the queries I tested with.


Solution

  • As you said, it's returning the 'distance' between the two vectors when it searches. Vectors that are similar are placed closer to each other in vector space, so a lower distance means the query and the document are more similar.
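
To make the "lower distance = closer match" point concrete, here is a minimal sketch in plain Python. It does not call Chroma or LangChain. The three-dimensional vectors are made-up toy values standing in for real embeddings, `l2_distance` is a hypothetical helper, and plain Euclidean distance is an assumption (the actual metric depends on how the Chroma collection is configured).

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy "embeddings" (assumed values, not produced by a real embedding model).
docs = {
    "cats are small pets": [0.9, 0.1, 0.0],
    "dogs are loyal pets": [0.7, 0.3, 0.1],
    "stock prices fell": [0.0, 0.2, 0.9],
}

# Toy query vector that sits close to the "cats" document.
query = [0.85, 0.15, 0.0]

# Score every document by its distance to the query and sort ascending,
# so the smallest distance (the best match) comes first.
scored = sorted(
    ((text, l2_distance(query, vec)) for text, vec in docs.items()),
    key=lambda pair: pair[1],
)

for text, dist in scored:
    print(f"{dist:.3f}  {text}")
```

The first line printed is the closest document and the last is the least related one. The (Document, score) pairs returned by similarity_search_with_score can be read the same way when the score is a distance: sort ascending and treat the smallest score as the best match.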