Search code examples
indexingcouchdb-2.0couchdb-mango

Couchdb, Mango queries and Indexes


I use Mango queries with Couchdb when users want to search the database. They can search by multiple fields in the document.

To simplify I have a document with three fields all of which can be searched to find the document.

FieldOne: cow
FieldTwo: pig
FieldThree: dog

When creating a mango index for the query what fields should I use? Should I create and index with multiple fields? Does the order matter?

There are also three different document types (A, B C) defined in a field. I only need to search one, is it possible to exclude the other types from the index to make it faster?

Field_Type: A

It would make sense to me to be able to run an index against a view to only search through the documents I am interested in. Is this possible?

Example indexes

One index on field known to appear in the query

  {
       "index": {
          "fields": [
             "FieldOne"
          ]
       },
       "name": "foo-json-index",
       "type": "json"
    }

Multiple indexes, not sure if used or not?

 {
       "index": {
          "fields": [
             "FieldOne",
             "FieldTwo",
             "FieldThree"
          ]
       },
       "name": "foo-json-index",
       "type": "json"
    }

Or multiple indexes to choose the correct one when building the query?

which is the correct approach to get the fastest search results?


Solution

  • I don't think you can easily solve this in the general case - you'd need a specific selector (or set of selectors) to optimise for. However, there are 3 aspects to consider when creating an index that may help:

    1. When you create a Mango index, only documents containing all of the indexed fields are included in the index. The second index example you cite, containing all 3 fields, is perfectly fine so long as you are always going to specify all 3 fields in the query selector.

    2. At query time, an index can only be used if all it's indexed fields are required to exist according to the selector. For example, if your index contains fields A and B but you only query for A, we can't use the index because it won't include documents that contain A but not B.

    3. Order of fields in the index matters. An index can only be used if it can find a contiguous range of values for the indexed fields - the same as a compound key for a _view. For example, let's say you have an index on fields A and B and the documents [{A:1,B:1},{A:1,B:2},{A:2,B:1},{A:2,B:2}]. The index will look like: [[1,1],[1,2],[2,1],[2,2]]. If your query is then A >= 1 AND B == 2, the matching documents ([1,2] and [2,2]) the only range that covers all matching documents is [1,2],[2,1],[2,2]] - the value [2,1] would need to be filtered out in memory.

    You can see what index is used to fulfil a query using the _explain endpoint, and that should give you some clue as to how selective the index is able to be for a given selector.