rethinkdb reql

Why is RethinkDB's count operation so slow?


I am benchmarking some queries in RethinkDB, and I have not found a good answer to the question: why is RethinkDB's count() operation so slow?

I have a query that runs against 2GB of data:

r.db("2GB").table("table").between(40, r.maxval, {index:"price"})

The query returns in 5 milliseconds. But once I count the number of items, like this:

r.db("2GB").table("table").between(40, r.maxval, {index:"price"}).count()

it takes more than 6 seconds. Every query that uses the count operation is very slow. I have seen many issues on GitHub but could not find the exact reason.

Update: it's not just between(); with other operations such as filter, count() is also horribly slow.


Solution

  • When you call between you get back a cursor, which loads data off disk lazily as you iterate over it. So the time needed to return the cursor is the time needed to read the first batch of your data, not all of your data. count, on the other hand, has to look at the whole table before it can return, so it takes time proportional to the size of your table. The 5 milliseconds and the 6 seconds therefore measure different amounts of work: one batch in the first case, everything the query selects in the second.
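
To make the difference concrete, here is a minimal sketch in plain JavaScript that imitates a lazy cursor with a generator. It does not use the RethinkDB driver and is not how RethinkDB is implemented internally; `makeIndexRange`, `firstBatch`, `countAll`, the batch size of 40, and the one million matching rows are all made-up names and numbers for illustration. Each simulated row read increments a counter, so you can compare how much work each operation does.

```javascript
// Illustration only: a generator stands in for a lazy cursor.
// Every yielded row represents one row read from disk.
function makeIndexRange(matchingRows) {
  const stats = { rowsRead: 0 };
  function* cursor() {
    for (let i = 0; i < matchingRows; i++) {
      stats.rowsRead += 1; // stands in for one disk read
      yield { id: i, price: 40 + i };
    }
  }
  return { stats, cursor };
}

// Returning a cursor only requires reading the first batch.
function firstBatch(cursor, batchSize) {
  const batch = [];
  for (const row of cursor) {
    batch.push(row);
    if (batch.length === batchSize) break; // stop early, nothing else is read
  }
  return batch;
}

// Counting has to walk through every row the cursor would produce.
function countAll(cursor) {
  let n = 0;
  for (const _row of cursor) n += 1;
  return n;
}

const MATCHING_ROWS = 1000000; // hypothetical size of the between() range
const BATCH_SIZE = 40;         // hypothetical batch size

const lazy = makeIndexRange(MATCHING_ROWS);
const batch = firstBatch(lazy.cursor(), BATCH_SIZE);
const lazyReads = lazy.stats.rowsRead;

const full = makeIndexRange(MATCHING_ROWS);
const total = countAll(full.cursor());
const fullReads = full.stats.rowsRead;

console.log("rows read to return the first batch:", lazyReads);
console.log("rows read to count everything:", fullReads, "count =", total);
```

In this simulation the first call reads only as many rows as fit in one batch, while the count reads all of them, which is the same shape as the gap between the two timings in the question.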