I have saved vectors in Weaviate that I want to query using dot product. I'm using the Python SDK and I just don't see any way of specifying this. Does anyone know whether this is possible?
Hi and thanks for your question.
The short answer, as of writing this, is "not yet, but soon", but that needs some elaboration.
Generally, distance functions in Weaviate are entirely pluggable. Anything that can produce a score can be plugged in. For example, see this folder. In fact, you will even see a file named dot_product.go in there. This is because, internally, to calculate the cosine similarity, Weaviate normalizes all vectors on read and then just calculates the dot product.
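To illustrate why a dot product implementation is enough for cosine similarity, here is a minimal sketch in plain Python. It is not Weaviate's actual Go code; the function names `dot`, `normalize`, and `cosine_similarity` are made up for this example.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (hypothetical helper for this sketch)."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity computed as the dot product of normalized vectors."""
    return dot(normalize(a), normalize(b))


a = [3.0, 4.0]
b = [4.0, 3.0]

print(dot(a, b))                # raw dot product, depends on vector length
print(cosine_similarity(a, b))  # depends only on the angle between a and b
```

Once the vectors are normalized, the two metrics coincide, which is why a single dot product routine can serve both.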
So, if Weaviate can calculate the dot product, why can't you select this option? This is because of a past decision to introduce the certainty field in the API. This field is used to return scores and to limit results by score. The original idea behind certainty was that we wanted a single metric that produces a number between 0 and 1 to indicate the distance. With cosine similarity that's simple, as it is already in the range [-1, 1], so it's very easy to transform it into a certainty. With an unbounded score such as dot product, this isn't so easy.
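As a rough sketch of the problem, the snippet below maps a cosine similarity onto [0, 1] with a simple linear rescaling. That particular formula is an assumption for illustration, and `cosine_to_certainty` is a hypothetical name, not part of the Weaviate API. The second half shows that a dot product has no fixed range to rescale from.

```python
def cosine_to_certainty(cosine):
    """Map a cosine similarity in [-1, 1] onto [0, 1].

    Assumed linear mapping for illustration only.
    """
    if not -1.0 <= cosine <= 1.0:
        raise ValueError("cosine similarity must lie in [-1, 1]")
    return (1.0 + cosine) / 2.0


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


print(cosine_to_certainty(-1.0))  # 0.0
print(cosine_to_certainty(0.0))   # 0.5
print(cosine_to_certainty(1.0))   # 1.0

# Scaling a vector scales the dot product with it, so there is no
# fixed interval that could be mapped onto [0, 1].
query = [3.0, 4.0]
print(dot([1.0, 2.0], query))    # 11.0
print(dot([10.0, 20.0], query))  # 110.0
```

Any mapping from dot product to a bounded certainty would need to know the largest possible score in advance, which depends on the data.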
Here is a discussion on this topic. Feel free to participate in this discussion. The current favorite option is to deprecate certainty and expose the raw values as either score or distance.
We could easily enable new distance metrics, such as dot product, before the above-mentioned API issue is solved, possibly as an experimental feature behind a feature flag. However, you would not be able to see the resulting scores/distances in the APIs.
I expect the above-mentioned issue to be resolved within a couple of weeks of writing this (April 28, 2022).