Search code examples
vectorrediscassandranosqlvector-search

NoSQL DB for searching in vector space


I am completely new to NoSQL DBS such as Cassandra, Mongo, Redis, etc. and I want to create this type of a structure :

{
  "item_id": "ABC1",
  "x1": 0.55,
  "x2": -0.29,
  ...
  "x100": 0.17
}

Basically, I have millions of items and 100 floats associated with each of them. My main task is to search for items that are near a given vector of floats (in the vector space of dimension 100), and get for example the top k items or all the items for which distance is less than d.

Is there a NoSQL database that is particularly suited for this kind of task?

Thank you for any hint, Patrick


Solution

  • As far as I know, there are no databases with out-of-the-box support for non-(2|3)D spatial indexes yet, but you can implement your own inside your application layer.

    In general, you would like to have an efficient algorithm for N-dimensional nearest neighbour search like these:

    • KD-Tree with overall O(log N) complexity
    • Geohash

    But both of them are quite tricky to be implemented correctly.