Understanding stress-ref in The Joy of Clojure Chapter 11

I can't completely understand the behaviour of `stress-ref` in The Joy of Clojure, section 11.2.5. My question is: why does reading `r` require a large history?

Solution

Background:

For starters, it may help to do some reading on multiversion concurrency control, the mechanics underlying Clojure's software transactional memory (STM; reference types).

The nutshell answer:

Within the context of a transaction, consistency is vital: any ref that is read must appear to hold a single value for the duration of the transaction. That means the value the ref had when the transaction started must remain available until the transaction finishes. That's why history is important: without history, any change to the ref while the reading transaction runs would force a retry, since the start value of the ref wouldn't be available anymore. With history, we can be a bit more lax: instead of requiring that the ref remain unchanged for the entire transaction, we only need its value as of the start of the transaction to still be retrievable, so every read within the transaction sees a consistent value.

Some illustrative examples:

The toy example presented in JOC does not really speak to this importance, but hopefully this will help illustrate the concerns:

(defn stress-ref [r]
  (let [slow-tries (atom 0)]
    ;; One long-running transaction
    (future
      (dosync
        (swap! slow-tries inc)
        (println "1st r read:" @r)  ; do something with the ref
        (Thread/sleep 200)          ; do some work
        (println "2nd r read:" @r)) ; do something else with the ref
      (println (format "transaction complete. r is: %s, history: %d, after: %d tries"
                       @r (.getHistoryCount r) @slow-tries)))
    ;; 500 very quick transactions
    (dotimes [i 500]
      (Thread/sleep 10)
      (dosync (alter r inc)))
    :done))

(stress-ref (ref 0 :min-history 20 :max-history 30))

This returns:

1st r read: 0
2nd r read: 0
transaction complete. r is: 19, history: 19, after: 1 tries
:done

As you can see, the value of the ref through the entire transaction is 0. It would be rather weird if this value changed throughout the course of the transaction.

However, once the transaction completes, the value has already incremented to 19. Since this happens immediately after the "2nd read", we can take this as evidence that throughout the course of the transaction the ref history is being used so that we have consistency.
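The `history: 19` figure above comes from the Java interop call `.getHistoryCount`. As a minimal sketch, the same information is available through the `clojure.core` functions `ref-history-count`, `ref-min-history` and `ref-max-history`. The name `counter` is only illustrative, and the writes here are uncontended (no concurrent reader), so the history grows only because of `:min-history`:

```clojure
;; `counter` is a hypothetical ref used only for this illustration.
(def counter (ref 0 :min-history 20 :max-history 30))

(ref-min-history counter)    ; the configured lower bound on retained past values
(ref-max-history counter)    ; the configured upper bound
(ref-history-count counter)  ; no writes yet, so no past values are retained

;; 25 uncontended writes: each one keeps the previous value
;; until the history reaches :min-history.
(dotimes [_ 25]
  (dosync (alter counter inc)))

(ref-history-count counter)  ; expected to stop growing at the :min-history bound
```

Without read faults from a concurrent transaction, the count should settle at the `:min-history` value rather than climbing toward `:max-history`.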

A little deeper on the lifecycle of a transaction:

To glean a little more insight into what is going on throughout the course of the transaction, we can tweak our min history so that we force a few retries:

user=> (stress-ref (ref 0 :min-history 15 :max-history 30))
1st r read: 0
1st r read: 19
1st r read: 39
1st r read: 59
1st r read: 79
2nd r read: 79
transaction complete. r is: 99, history: 19, after: 5 tries

Note that in this case, the history is insufficient to start off. The first time the "2nd read" is attempted, the transaction is restarted because the ref no longer holds the value it had when the transaction started. Rather than continue with an inconsistent value, the transaction retries, and the failed read causes the ref's history to grow. That history still isn't long enough, so it restarts again, and so on, until the history has grown enough to keep the same value available throughout the entire transaction. At that point, the second read can complete successfully, and the kingdom can rejoice.
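The timing-based runs above can vary from machine to machine. The following is a rough sketch of the same effect with the timing removed: one reading transaction is interrupted by exactly one write, coordinated with promises. The function name `snapshot-demo` and its local names are made up for this illustration; it is not code from the book.

```clojure
(defn snapshot-demo
  "Runs one reading transaction that is interrupted by exactly one write.
  Returns the two values read, the number of tries, and the final value."
  [min-history]
  (let [r       (ref 0 :min-history min-history)
        started (promise)   ; reader signals: first read is done
        written (promise)   ; writer signals: the write has committed
        tries   (atom 0)
        reader  (future
                  (dosync
                    (swap! tries inc)          ; counts attempts, including retries
                    (let [first-read @r]
                      (deliver started true)
                      @written                 ; wait until the write has committed
                      [first-read @r])))]      ; second read of the same ref
    @started                                   ; wait for the reader's first read
    (dosync (alter r inc))                     ; the single competing write
    (deliver written true)
    (let [reads @reader]
      {:reads reads :tries @tries :final @r})))

(snapshot-demo 1)  ; one past value is kept, so the old value stays readable
(snapshot-demo 0)  ; no history: the second read cannot find the old value
```

With `:min-history 1` the reader should finish in one try and see `0` both times, even though the ref already holds `1`. With `:min-history 0` the second read should fault, and the retry then sees `1` on both reads. In both cases the two reads inside a single attempt agree, which is the consistency guarantee described above.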

Something different:

You could say, "But why don't we just locally cache the value of r that we started off with? Then we wouldn't have to worry about all this history nonsense." To some extent that's true, but then we wouldn't be doing MVCC (proper) anymore; we'd be doing some other form of concurrency control. I'm guessing that the main advantage of not doing this is reduced implementation complexity. Another is likely that using history lets you force retries when a ref has changed more than you'd ideally like over the course of a transaction. So, there are trade-offs.