Below is the code I'm currently using. I'm comparing a vector of 768 floats against 50k others, and it takes about 800 ms. I assume there's a much faster implementation, either in C# or in some package that does the calculation natively, but I'm having trouble finding one. Thanks!
// USAGE:
// vectors is IEnumerable<float[768]>
// vector is float[768]
vectors.Select(v => v.DotProductSum(vector) * 100)
public static float DotProductSum(this IEnumerable<float> values, IEnumerable<float> other)
{
    return values.Zip(other, (d1, d2) => d1 * d2).Sum();
}
I found a very fast solution, Faiss, which in my testing was able to query tens of thousands of 2048-float vectors in under 5 ms. I'm consuming it from .NET, so I used the FaissMask wrapper library. To do that you need a number of native dependencies, which you can get by building the faiss repo; I haven't found a package that includes them. Specifically, I needed:
libgcc_s_seh-1.dll
libgfortran-3.dll
libopenblas.dll
libquadmath-0.dll
faiss.dll
faiss_c.dll
After that, the code is very straightforward:
using var index = new FaissMask.IndexFlat((int)embeddingSize, MetricType.MetricInnerProduct);
index.Add(vectors);
var queryResults = index.Search(queryVector, 10);
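For reference, a flat inner-product index does an exact brute-force search: it scores the query against every stored vector and keeps the k highest scores. If the native dependencies are a problem, the same computation can be sketched in managed code with System.Numerics.Vector<float>, which should already be far cheaper per comparison than Zip/Sum. This is only a rough sketch, not what Faiss does internally; the class and method names (FlatInnerProductSketch, Dot, TopK) are made up for illustration, and I haven't benchmarked it against the numbers above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;

// Hypothetical helper, not part of Faiss or FaissMask.
public static class FlatInnerProductSketch
{
    // SIMD dot product of two equal-length arrays.
    public static float Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        int width = Vector<float>.Count;      // lane count depends on the hardware
        var acc = Vector<float>.Zero;
        int i = 0;

        // Multiply and accumulate one SIMD-width chunk at a time.
        for (; i <= a.Length - width; i += width)
            acc += new Vector<float>(a, i) * new Vector<float>(b, i);

        // Horizontal add of the accumulator lanes.
        float sum = Vector.Dot(acc, Vector<float>.One);

        // Scalar tail for lengths that are not a multiple of the SIMD width.
        for (; i < a.Length; i++)
            sum += a[i] * b[i];

        return sum;
    }

    // Indices and scores of the k stored vectors with the largest inner product.
    public static (int Index, float Score)[] TopK(
        IReadOnlyList<float[]> vectors, float[] query, int k)
    {
        return vectors
            .Select((v, idx) => (Index: idx, Score: Dot(v, query)))
            .OrderByDescending(r => r.Score)
            .Take(k)
            .ToArray();
    }
}
```

Usage would be something like FlatInnerProductSketch.TopK(vectors, queryVector, 10), with vectors materialized as a List<float[]> or float[][] up front so that each query doesn't re-enumerate a lazy sequence.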