Search code examples
concurrencyschemerace-conditionsicp

Why an expression using a global variable multiple times reads that variable multiple times instead of just once?


At page 304 from SICP, the following snippet is shown

(define x 10)

(parallel-execute (lambda () (set! x (* x x)))
                  (lambda () (set! x (+ x 1))))

about which it is said that

x will be left with one of five possible values

At this point I was a bit perplexed. I thought that the two processes could interfere only in consequence of their (lack of) coordination in reading from and writing to x. So I'd have expected the following scenarii, where a/A is the first process' read/write of x and b/B is the second process' read/write of x,

  • abAB -> 11
  • baAB -> 11
  • baBA -> 100
  • abBA -> 100
  • aAbB -> 101
  • bBaA -> 121

which sum up to 4 possible results.

But the book claims there's a 5th one, 110, which hinges on the idea that the first process reads x twice, in which case a possible scenario is that it reads 10 the first time, before the second process writes 11 to it, and 11 the second time, afterwards, ending up multiplying 10 by 11, thus obtaining 110.

Why is that? Why does the first process need to read x twice? And if it doesn't need, then why was the language designed this way? What are the advantages in doing so?


By the way, I'm curious to know if this issue is common to all/some/none of the other languages, for instance, in C++.


Solution

  • In general I think you should assume that code is evaluated as it is written, unless the language specifically allows it not to be. And for programs with concurrency anything could happen between any steps of the evaluation.

    That assumption leads to the five possibilities specified by SICP.

    In fact this problem happens even for programs without concurrency. For instance let's say I've written something like this:

    (define x 1)
    
    (define (g f)
      (+ x (f x) x))
    

    And assume that the evaluator evaluates function arguments left-to-right (Scheme does not specify this). It's tempting to say that this could, at least, be turned into

    (define x 1)
    
    (define (g f)
      (let ((v x))
        (+ v (f x) v)))
    

    And maybe if you know that + is what it looks like then you could even do some more simplification, possibly at the cost of floating-point problems.

    But none of those are safe, because I can say

    (g (lambda (y) (set! x (* y y))))
    

    Except, of course, maybe when you read the gory details of the language standard you'll find complicated wording about what assumptions are allowed to be made by the evaluator, and what assumptions are not allowed to be made.

    But if you assume that the evaluator does what it looks like it does then the things you do to ensure that the program behaves properly will always err on the side of being too safe, rather than not safe enough. And that's the assumption you want to make. To return to the SICP code; if you wanted there only to be four possible outcomes, you could change it to this:

    (define x 10)
    
    (parallel-execute (lambda () (let ((y x)) (set! x (* y y))))
                      (lambda () (set! x (+ x 1))))
    

    And you know that this is right, even if it turns out that the evaluator is allowed to make assumptions about repeated variables.