Search code examples
countcouchbasesql++

What are the factors that can make the item count differ from the item id count in couchbase?


On a 3-nodes Couchbase Community Edition 5.0.1 build 5003 cluster, couchbase indicates that it contains 12268503 items. However, when counting the ids, the result is 6132875.

What are the factors that can make the item count differ from the item id count in couchbase?

More precisely, when the following N1QL query is executed on a buckets - say Product

SELECT count(1) FROM Product

It gives

12268503

While when the count is made on the item ids

SELECT count(META(Product).id) FROM Product

It returns:

6132875

That is, the number of ids is less than 50% of the number of items.

Also, they was no operation (0 ops/s) on the bucket for several hours, which excludes the possibility of the primary index not catching up due to a traffic peak.

I pored through the couchbase blog & doc without finding any clues as for this count difference. Any pointer is much appreciated.


Solution

  • If the query has no predicate/no join and projection has single expression count(*), count(constant) the query gets the results from bucket stats and provide the info (takes sub milli seconds).

       SELECT count(*) FROM Product;
       SELECT count(1) FROM Product;
    

    The following is almost similar but COUNT argument is expression so it has to use index and do aggregation (As in this case document key which unique and must be string, optimizer should have considered as previous approach, As of now no optimization)

    SELECT count(META(Product).id) FROM Product
    

    In second case it uses index, Your index might have pending items and not caught up. Try use scan_consistency. Check the index stats start with.