Search code examples
query-optimizationcouchbasesql++

Optimizing seemingly simple couchbase query for "items whose children satisfy"


I'm developing a system to store our translations using couchbase.

I have about 15,000 entries in my bucket that look like this:

{
  "classifications": [
    {
      "documentPath": "Test Vendor/Test Project/Ordered",
      "position": 1
    }
  ],
  "id": "message-Test Vendor/Test Project:first",
  "key": "first",
  "projectId": "project-Test Vendor/Test Project",
  "translations": {
    "en-US": [
      {
        "default": {
          "owner": "414d6352-c26b-493e-835e-3f0cf37f1f3c",
          "text": "first"
        }
      }
    ]
  },
  "type": "message",
  "vendorId": "vendor-Test Vendor"
},

And I want, as an example, to find all messages that are classified with a "documentPath" of "Test Vendor/Test Project/Ordered".

I use this query:

SELECT message.*
FROM couchlate message UNNEST message.classifications classification
WHERE classification.documentPath = "Test Vendor/Test Project/Ordered"
      AND message.type="message"
ORDER BY classification.position

But I'm very surprised that the query takes 2 seconds to execute!

Looking at the query execution plan, it seems that couchbase is looping over all the messages and then filtering on "documentPath".

I'd like it to first filter on "documentPath" (because there are in reality only 2 documentPaths matching my query) and then find the messages.

I've tried to create an index on "classifications" but it did not change anything.

Is there something wrong with my index setup, or should I structure my data differently to get fast results?

I'm using couchbase 4.5 beta if that matters.


Solution

  • Your query filters on the documentPath field, so an index on classifications doesn't actually help. You need to create an array index on the documentPath field itself using the new array index syntax on Couchbase 4.5:

    CREATE INDEX ix_documentPath ON myBucket ( DISTINCT ARRAY c.documentPath FOR c IN classifications END ) ;
    

    Then you can query on documentPath with a query like this:

    SELECT * FROM myBucket WHERE ANY c IN classifications SATISFIES c.documentPath = "your path here" END ;
    

    Add EXPLAIN to the start of the query to see the execution plan and confirm that it is indeed using the index ix_documentPath.

    More details and examples here: http://developer.couchbase.com/documentation/server/4.5-dp/indexing-arrays.html