
Why are quorum reads and writes with read repair not linearizable?


From Designing Data-Intensive Applications:

Cassandra does wait for read repair to complete on quorum reads [27], but it loses linearizability if there are multiple concurrent writes to the same key, due to its use of last-write-wins conflict resolution.

I've read elsewhere that read repair needs to be atomic in order to make the system linearizable.

However, in Martin Kleppmann's YouTube series, he states that quorum reads and writes with read repair are linearizable.

I'm confused by this. Intuitively, it makes sense that read repair gives linearizability even if it is not atomic, because it delays the read response to the client, which makes the conflicting read concurrent with other reads.
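To make that intuition concrete, here is a minimal sketch of a quorum read that blocks on read repair before answering. It is an in-memory toy, not Cassandra's implementation: `Replica`, `Versioned`, `quorum_read` and `run` are hypothetical names, and the three replicas with read quorum 2 are an assumed setup. A write of "new" has reached only one replica, and two readers then read one after the other.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    ts: int  # writer-assigned timestamp; the higher timestamp wins


class Replica:
    """Hypothetical in-memory replica, for illustration only."""

    def __init__(self):
        self.store = {}

    def put(self, key, v):
        cur = self.store.get(key)
        if cur is None or v.ts > cur.ts:
            self.store[key] = v

    def get(self, key):
        return self.store.get(key)


def quorum_read(replicas, key, repair=True):
    """Read from the contacted replicas and return the newest value.

    With repair=True the newest value is written back to every contacted
    replica *before* the result is returned (blocking read repair).
    """
    answers = [a for a in (r.get(key) for r in replicas) if a is not None]
    newest = max(answers, key=lambda a: a.ts, default=None)
    if repair and newest is not None:
        for r in replicas:
            r.put(key, newest)
    return newest.value if newest is not None else None


def run(repair):
    a, b, c = Replica(), Replica(), Replica()
    for r in (a, b, c):
        r.put("x", Versioned("old", 1))
    # A write of "new" is still in flight and has reached only replica a.
    a.put("x", Versioned("new", 2))
    first = quorum_read([a, b], "x", repair)   # reader 1 contacts a and b
    second = quorum_read([b, c], "x", repair)  # reader 2 starts after reader 1 returned
    return first, second
```

In this sketch `run(repair=True)` gives both readers "new", while `run(repair=False)` lets the second reader see "old" after the first reader already saw "new", which is the stale read that linearizability forbids. The sketch covers a single in-flight write only; it says nothing about concurrent writes to the same key.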

The downside I see in a system like this is that concurrent writes can be dropped even though a success message is sent back to the client.
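A small sketch of that downside, assuming last-write-wins on client-supplied timestamps. The names `lww_merge` and `apply_writes` are hypothetical, and the timestamps are made up to model a client whose clock runs behind.

```python
def lww_merge(current, incoming):
    """Keep whichever (timestamp, value) pair has the larger timestamp."""
    if current is None or incoming[0] > current[0]:
        return incoming
    return current


def apply_writes(writes):
    """Apply writes in arrival order; every write is acknowledged."""
    state = None
    acked = []
    for ts, value in writes:
        state = lww_merge(state, (ts, value))
        acked.append(value)  # the client is told "success" either way
    return state, acked


# Client B's clock is behind: its write arrives later in real time
# but carries a smaller timestamp than client A's write.
state, acked = apply_writes([(105, "from-A"), (100, "from-B")])
```

Here `acked` contains both writes, but `state` still holds "from-A": client B was told its write succeeded, yet no later read will ever return it.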


Solution

  • Writes in this kind of "non-Paxos/Raft" system provide weaker consistency because they are not atomic. While a particular write is still in progress, some nodes may already have stored the new value while others have not; the same can happen when multiple concurrent writes are in flight. The system as a whole is therefore temporarily inconsistent, and it eventually (hopefully!) becomes consistent again once the recovery strategy you mentioned (read repair) has been applied correctly.

    That would be different (and inherently significantly slower) with Paxos or Raft in place. Having either one coordinate each write would make writes atomic, and these issues would go away (provided the Paxos or Raft implementation is correct, and such code is nontrivial to write and far harder to test).

    So, in short, once a write has completed successfully, the system behaves exactly as you'd expect. However, while a write is in progress, overlaps with a concurrent conflicting write, or has simply failed, all sorts of problems typical of eventual consistency arise and bring a fair amount of chaos with them. Please read this post for more details.


    I hope I understood your question and that this answer is appropriate; if not, please ask a more precise question.