
rethinkdb | nested vs chained queries, any difference?


Is there any difference between chained filters:

r.db('catbox').table("bw_mobile").filter(
    r.row("value")("appVersion")("major").le(2)
).filter(
  r.row("value")("appVersion")("minor").le(2)
).filter(
  r.row("value")("appVersion")("patch").le(10)
)

nested:

r.db('catbox').table("bw_mobile").filter(
      r.row("value")("appVersion")("major").le(2).and(
        r.row("value")("appVersion")("minor").le(2).and(
          r.row("value")("appVersion")("patch").le(10)
        )
      )
)

or lambda functions:

r.db('catbox').table("bw_mobile").filter(
  // A template literal is used because a double-quoted string cannot span lines
  r.js(`(function (session) { 
        return session.value.appVersion.major < 0 
            || ( session.value.appVersion.major == 0 && session.value.appVersion.minor < 0 )
            || ( session.value.appVersion.major == 0 && session.value.appVersion.minor == 0 && session.value.appVersion.patch < 71 )
        ; 
    })`)
)

TY!


Solution

  • I believe that the second case (single filter with multiple and expressions) is the most efficient and the most convenient to use. I would take the following ideas into account:

    r.filter, as documented, always creates a new selection, stream, or array, regardless of the result of the predicate function passed to it. I'm not sure how selections are implemented in RethinkDB (I believe they are stream-like), but chaining filters over arrays may be expensive because each step allocates an intermediate array. Compare this to Array.prototype.filter, which creates a new array as its result. Streams are lazy, so each element is computed (or not) only on demand, which gives a smaller memory footprint. Compare this with iterators, streams, and generators in other languages (Iterator<E>/Stream<E> in Java, IEnumerator<T> and yield return in .NET/C#, iterators and generator functions in JavaScript, yield in Python, pipes | in shell commands, etc.), where you can combine iterators and generators. In either case, though, chaining still leaves you with intermediate filters.
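To make the eager-versus-lazy distinction concrete, here is a minimal sketch in plain JavaScript. It does not use RethinkDB and says nothing about how the server implements selections; the `lazyFilter` helper and the sample `rows` are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: a lazy filter built on a generator function.
// Items are tested one at a time, only when the consumer asks for them.
function* lazyFilter(iterable, predicate) {
  for (const item of iterable) {
    if (predicate(item)) {
      yield item;
    }
  }
}

// Hypothetical sample rows shaped like the documents in the question.
const rows = [
  { value: { appVersion: { major: 1, minor: 0, patch: 3 } } },
  { value: { appVersion: { major: 2, minor: 5, patch: 1 } } },
  { value: { appVersion: { major: 3, minor: 1, patch: 0 } } },
  { value: { appVersion: { major: 2, minor: 2, patch: 10 } } },
];

const majorOk = (row) => row.value.appVersion.major <= 2;
const minorOk = (row) => row.value.appVersion.minor <= 2;
const patchOk = (row) => row.value.appVersion.patch <= 10;

// Eager chaining: every .filter() call allocates a new intermediate array.
const eagerResult = rows.filter(majorOk).filter(minorOk).filter(patchOk);

// Lazy chaining: the three filters are still separate stages, but no
// intermediate arrays are built; work happens only while spreading.
const lazyResult = [
  ...lazyFilter(lazyFilter(lazyFilter(rows, majorOk), minorOk), patchOk),
];

console.log(eagerResult.length, lazyResult.length);
```

Both variants select the same rows; the difference is only in when the work is done and how much intermediate storage is allocated.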

    A single expression can replace a chain of filter operations. Note that the r.and operation in your expression has one very important feature: it is a short-circuit operation. If the left-hand operand of AND is false, the right-hand expression does not need to be evaluated at all, because the result is always false. You cannot do that with chained r.filter calls. Compare this with the SQL WHERE clause, which is specified once per query (all false cases can simply be discarded by the AND operator). Also, from a pragmatic perspective, you can create a factory function with a convenient name that returns parameterized ReQL expressions. These can even be assigned to constants, since ReQL expressions are immutable and safe to reuse:

    const maxVersionIs = (major, minor, patch) => r.row("value")("appVersion")("major").le(major)
        .and(r.row("value")("appVersion")("minor").le(minor))
        .and(r.row("value")("appVersion")("patch").le(patch));
    
    const versionPriorToMilestone = maxVersionIs(2, 2, 10);
    
    ...
    
    .filter(maxVersionIs(major, minor, patch))
    
    ...
    
    .filter(versionPriorToMilestone)
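The short-circuit behaviour mentioned above can be illustrated with plain JavaScript's `&&` operator. This is only an analogy under the assumption that r.and behaves the same way; it is not a measurement of RethinkDB itself. The `calls` counter and the `checkMajor`/`checkMinor`/`checkPatch` names are hypothetical.

```javascript
// Counts how many times each predicate is actually evaluated.
const calls = { major: 0, minor: 0, patch: 0 };

const checkMajor = (v) => { calls.major += 1; return v.major <= 2; };
const checkMinor = (v) => { calls.minor += 1; return v.minor <= 2; };
const checkPatch = (v) => { calls.patch += 1; return v.patch <= 10; };

// Hypothetical sample versions.
const versions = [
  { major: 3, minor: 0, patch: 0 }, // fails on major: minor and patch are skipped
  { major: 2, minor: 9, patch: 0 }, // fails on minor: patch is skipped
  { major: 1, minor: 1, patch: 1 }, // passes all three checks
];

// A single predicate combining the checks with short-circuit AND.
const matches = versions.filter(
  (v) => checkMajor(v) && checkMinor(v) && checkPatch(v)
);

console.log(matches.length, calls);
```

The later checks run only for items that passed the earlier ones, so the counters decrease from left to right.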
    

    ReQL expressions (RethinkDB queries) are actually expression trees, which are much easier to parse and convert directly into an execution plan than JavaScript scripts are to execute. The official documentation even recommends avoiding r.js for better performance. I guess the cost here is JavaScript runtime setup, isolated script execution, and checking for the script timeout. Additionally, scripts are more error-prone, whereas expression trees can be more or less inspected at compile time. But, for the sake of completeness, r.js can be more powerful even with those costs, because ReQL is a limited set of operations. From my personal experience: I had to implement a sort of permission-checking subsystem based on RethinkDB, and I needed a bitwise AND operation. Unfortunately, RethinkDB as of 2.3 does not support bitwise operations, so I had to use r.js: r.js(`(function (user) { return !!(user.permissions & ${permissions}); })`). A future release of RethinkDB will support bitwise operations, so r.getField('permissions').bitAnd(permissions) should someday work, run faster, and be combinable with other expressions to fit into a single filter.
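As a rough sketch of the r.js workaround described above, the script source can be built by a small helper and checked locally before it is sent to the server. The `permissionCheckSource` helper and the `READ`/`WRITE` flags are hypothetical, and the local `eval` is used here only to show what the script computes; with RethinkDB the string would be passed to r.js instead.

```javascript
// Hypothetical helper: builds the source string intended for r.js.
// The mask is validated so that only an integer is interpolated into the script.
function permissionCheckSource(permissions) {
  if (!Number.isInteger(permissions)) {
    throw new TypeError("permissions must be an integer bit mask");
  }
  return `(function (user) { return !!(user.permissions & ${permissions}); })`;
}

// Hypothetical permission flags.
const READ = 1;
const WRITE = 2;

const source = permissionCheckSource(WRITE);

// Local evaluation for illustration only (indirect eval, global scope).
const check = (0, eval)(source);

console.log(check({ permissions: READ | WRITE })); // true: WRITE bit is set
console.log(check({ permissions: READ }));         // false: WRITE bit is not set
```

Validating the mask before interpolation matters because the value becomes part of executable script text.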