
MarkLogic MVCC: contemporaneous vs. non-blocking


I am trying to understand the contemporaneous and non-blocking settings with the help of an example. Please let me know if my understanding is correct.

Assume we have transactions T1, T2, and T3, all starting at timestamp 10.
T1, T2, and T3 commit at timestamps 30, 40, and 50 respectively. If a query transaction arrives at timestamp 35:

For contemporaneous: the query reads a version with T1 committed, and keeps T2 and T3 waiting until the read is done.

For non-blocking: the query gets to read only after all three transactions T1, T2, and T3 have committed at 50.


Solution

  • This is easiest to understand and matters most if you think about querying on a disaster recovery replica cluster. In a DR setup each forest on the primary is replicating its journal frames to matching forests on the replica. There are often multiple forests in a database, and because replication is at the forest level some forests might have slightly later data than others.

    Now imagine a (read-only) query comes in to the database on the replica. You have two choices. One, you could run the query at the last timestamp for which you have all the data (that's nonblocking). Two, you could run the query at the last timestamp you see any data for (in the forest furthest along) and wait for all the data to arrive (for the other forests) so you can get a transactionally consistent view for that later time (that's contemporaneous).

    Notice that both options are transactionally consistent. The setting only determines which timestamp the database chooses for the query to run at.

    Read-only queries always run lock-free.
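
To make the two choices concrete, here is a minimal Python sketch of the timestamp selection described above. It is only a toy model and not MarkLogic code or its API: the forest names, the timestamps, and the functions `nonblocking_timestamp`, `contemporaneous_timestamp`, and `forests_to_wait_for` are all hypothetical, and each forest is reduced to the latest timestamp it has received through replication.

```python
from typing import Dict, List


def nonblocking_timestamp(forest_timestamps: Dict[str, int]) -> int:
    """Latest timestamp for which every forest already has its data."""
    return min(forest_timestamps.values())


def contemporaneous_timestamp(forest_timestamps: Dict[str, int]) -> int:
    """Latest timestamp seen in any forest (the forest furthest along)."""
    return max(forest_timestamps.values())


def forests_to_wait_for(forest_timestamps: Dict[str, int], query_timestamp: int) -> List[str]:
    """Forests that have not yet received data up to the query timestamp."""
    return sorted(
        name for name, ts in forest_timestamps.items() if ts < query_timestamp
    )


# Hypothetical replica state: the latest timestamp replicated to each forest.
replica = {"forest-a": 50, "forest-b": 40, "forest-c": 30}

nb = nonblocking_timestamp(replica)      # 30: all forests have data up to here
ct = contemporaneous_timestamp(replica)  # 50: only forest-a has caught up to here

print(nb, forests_to_wait_for(replica, nb))  # 30 []
print(ct, forests_to_wait_for(replica, ct))  # 50 ['forest-b', 'forest-c']
```

In this model the nonblocking query runs immediately at an older timestamp, while the contemporaneous query runs at the newest timestamp but has to wait for the lagging forests to catch up. Either way, the query sees one consistent timestamp across all forests.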