Search code examples
lispcommon-lispundefined-behavior

Difference between `do` and `do*` in Lisp?


I am working on these two functions that differ only in how ret and curr are assigned their value while the loop runs. In the first function ret and curr are bound in parallel; in the second function they are bound sequentially.

parallel binding

(defun maxpower (base maximum)
    "returns base ^ k such that it is <= maximum"
    (do ((ret 1 curr)               ; parallel
         (curr base (* base curr))) ; binding
        ((> curr maximum) ret)))

sequential binding

(defun maxpower* (base maximum)
    "returns base ^ k such that it is <= maximum"
    (do* ((ret 1 curr)                ; sequential
          (curr base (* base curr)))  ; binding
         ((> curr maximum) ret)))

Question: is the 1st function is somehow wrong (*) because curr is both updated and evaluated at the same time (in parallel)?

IOW: if I change the order of the bindings there should be no difference in the parallel version?
How does Lisp decide about the parallelization of the bindings?

In my tests, both functions return the same value, as they are.

(*): I come from C background; I'd say 1st function invokes Undefined Behaviour.


Solution

  • It's probably best to look at let versus let* first. If you understand that, then do versus do* follows from that, except for the additional consideration of the step-forms.

    Common Lisp is a strictly evaluated language. In both let and let*, the variable init-forms are evaluated left to right. The difference is in the scope and binding. Under let, all of the init forms are evaluated in a scope in which none of the variables are visible, whereas under let*, the forms are evaluated in an environment under which all the previous variables are visible. Secondly, since under let*, the previous variables are visible, their values are also established.

    Using let we can create a scope in which the values of two variables appear swapped:

    (let ((x y)
          (y x))
       ...)
    

    The initializing expressions y and x are evaluated first, in that order, and then the new values x and y are bound to the resulting values, which makes this possible.

    On the other hand:

    (let* ((a 1)
           (b (+ a 2)))
    

    Here, 1 is evaluated, and a is bound. This a is then visible to the (+ a 2) expression whose value is computed, and bound to b.

    Now, onto do/do*. These macros perform, prior to the first iteration, binding of the variables that is exactly like let/let*. In binding the variables, the difference between do and do* is exactly like between let and let*.

    The do/do* macros also have step-forms, which give the next value to their corresponding iteration variables. These step-forms are all in the scope of all of the variables, regardless of whether the macro operator is do or do*. Whether you're using do or do*, you can refer to any variable in any step form. The difference is when the assignment takes place. Under do, all of the step forms are evaluated from top to bottom, and then their corresponding variables are assigned new values for the next iteration. Under do*, the behavior is "assign as you go". As each step-form is evaluated, the corresponding variable is assigned. Therefore under do, when the step-form refers to any variable, it refers to its value from the prior iteration. Under do*, if the step-form refers to a lexically earlier variable, it's picking up the new value. If it refers to a lexically later variable, it's still seeing the old value from the prior iteration.

    We have to emphasize that although let and do have some "parallel" behavior, in a sense, there is no parallel evaluation. All visible effects are performed left to right. What appears to happen in parallel is the variables coming into existence, or being assigned new values in the new iteration. But this is only parallel in the sense that the program cannot observe the intermediate progress. For instance, the passage of function arguments into a function is likewise "parallel"; the program doesn't observe a state in which the function call is partially in progress, and only half the arguments have been passed.