rethinkdb, rethinkdb-ruby

Get a collection and then changes to it without gaps or overlap


How do I reliably get the contents of a table, and then changes to it, without gaps or overlap? I'm trying to end up with a consistent view of the table over time.

I can first query the database, and then subscribe to a change feed, but there might be a gap where a modification happened between those queries.

Or I can first subscribe to the changes, and then query the table, but then the change feed might deliver a modification that is already reflected in the query result.

Example of this case:

A subscribe 'messages'
B add 'messages' 'message'
A <- changed 'messages' 'message'
A run get 'messages'
A <- messages

Here A received a 'changed' message before it sent its messages query, and the result of the messages query includes the changed message. Possibly A could simply ignore any changed messages it receives before the query result arrives. Is it guaranteed that changes received after a query result (on the same connection) are not already reflected in that result, i.e. that both are handled on the same thread?

What's the recommended way? I couldn't find any docs on this use case.


Solution

  • Michael Lucy of RethinkDB wrote:

    For .get.changes and .order_by.limit.changes you should be fine because we already send the initial value of the query for those. For other queries, the only way to do that right now is to subscribe to changes on the query, execute the query, and then read from the changefeed and discard any changes from before the read (how to do this depends on what read you're executing and what legal changes to it are, but the easiest way to hack it would probably be to add a timestamp field to your objects that you increment whenever you do an update).

    In 2.1 we're planning to add an optional argument return_initial that will do what I just described automatically and without any need to change your document schema.
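
The "subscribe, query, then discard stale changes" approach described above could look roughly like the following Ruby sketch. It works on plain hashes and does not talk to a database: `snapshot` stands for the result of the table read, and `buffered_changes` stands for the changefeed events collected since the subscription was opened, in the `old_val` / `new_val` shape that RethinkDB changefeeds emit. The method name `merge_snapshot_with_changes` and the integer `version` field are assumptions made for illustration; the answer only suggests some timestamp-like field that is incremented on every update.

```ruby
# Hypothetical helper, not part of the RethinkDB driver.
# Assumes every document has an 'id' and an integer 'version' field
# (an illustrative schema choice) that is incremented on each write.
def merge_snapshot_with_changes(snapshot, buffered_changes)
  # Index the snapshot by primary key.
  view = {}
  snapshot.each { |doc| view[doc['id']] = doc }

  # Replay buffered changefeed events in the order they arrived.
  buffered_changes.each do |change|
    old_val = change['old_val']
    new_val = change['new_val']

    if new_val.nil?
      # Deletion. Removing an absent document is a no-op, and a later
      # re-insert would appear as its own event further down the buffer.
      view.delete(old_val['id'])
      next
    end

    current = view[new_val['id']]
    # Apply inserts and updates only if they are newer than what we hold;
    # anything else was already reflected in the snapshot and is discarded.
    if current.nil? || new_val['version'] > current['version']
      view[new_val['id']] = new_val
    end
  end

  view
end

# Usage: the first event happened before the read and is already in the
# snapshot; the second happened after it.
snapshot = [{ 'id' => 1, 'text' => 'hello', 'version' => 2 }]
buffered_changes = [
  { 'old_val' => { 'id' => 1, 'text' => 'hi', 'version' => 1 },
    'new_val' => { 'id' => 1, 'text' => 'hello', 'version' => 2 } },
  { 'old_val' => nil,
    'new_val' => { 'id' => 2, 'text' => 'new message', 'version' => 1 } }
]
view = merge_snapshot_with_changes(snapshot, buffered_changes)
```

Once the buffer has been drained this way, later changefeed events can be applied with the same version check, so the view stays consistent without gaps or double-applied changes. This relies on every writer incrementing `version`; it is a sketch under that assumption, not a general guarantee.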