Search code examples
debuggingrecursionrackettail-recursionthe-little-schemer

[Little Schemer Ch3 pp.34 & 37]: Why (rember a (cdr lat)) as the 2nd argument of cons interpreted as unknown on p.37 example


I run both example on p.34 and p.37 using DrRacket debug mode step by step. And below are the stack windows results when processing (cdr lat) the first time of both examples.

p.34, the failed example without cons

(define rember
  (lambda (a lat)
    (cond
      ((null? lat) '())
      (else (cond
              ((eq? a (car lat)) (cdr lat))
              (else (rember a (cdr lat)))
              )))))

(rember 'c '(a b c d))

Stack area in debugger:

(cdr …)
(rember …)

p.37 with cons on last line:

(define rember
  (lambda (a lat)
    (cond
      ((null? lat) '())
      (else (cond
              ((eq? a (car lat)) (cdr lat))
              (else (cons (car lat)
                          (rember a (cdr lat)))))))))

(rember 'c '(a b c d))

Stack area in debugger:

(cdr …)
(rember …)
(rember …)

The Stack area with p.37's code indicates that the second call of rember has been classified as unknown before processing (cdr lat).

The only difference of 2 examples is p.37 added "cons". Cons takes 2 arguments, a s-expression and a list.

Without (cdr lat), rember itself doesn't return the list. And all the examples containing (cdr lat) within the first 40 pages of the book all have the same (function (cdr variable) format.

I don't understand why p.37 example rember itself being identified as unknown and justified for pending reduction while the contained (cdr lat) can be processed.

Or why rember in the position of 2nd argument of cons being interpreted that way.

Thanks!


Solution

  • TL;DR: what you see (and misinterpret) is the stack of function calls, and the effects of tail recursion.


    To answer your specific question about the debugger: your interpretation is wrong. What you see there is the run-time stack of function calls that got you to that specific point in execution timeline where you're at right now.

    It is not "unknown", not something "to be reduced later". You've already came through it, to your current execution point. What it is, is awaiting the results from the nested invocation, to continue doing its work with the results.

    If you click on Step few more times (with the p.37 code), you'll get to a deeper point where you'll see even more (rember)s appear in the Stack area. Your current execution point appears top-most on the Stack; the earliest – bottom-most.

    enter image description here

    Notice the Variables area shows the variables' values for that specific invocation frame.

    If you move your mouse cursor and hover over a lower (rember) and click on it, you'll see its variables' values:

    enter image description here

    Racket's debugger gets a bit getting used to.

    Also do notice the "last evaluated value" field in the top left corner shown in very small letters (in the previous image). This is a very important and useful piece of information, while debugging. It could use being a little bit more visible on the screen.

    The reason you don't see the stack of (rember)s grow with the first code (p.34),

    enter image description here

    is that it is tail recursive. There is nothing to be done with the result of the deep nested invocation of rember except to return it further back; so there's no need to save any state for that. This means the invocation frame for rember gets reused, replaced, and that's why you only ever see one of them, at the bottom of the Stack.

    But with the p.37's code there's more stuff to be done with the returned value - a preceding list element must be consed onto the result. This means that list element must be preserved, remembered somewhere. That somewhere is rember's invocation frame where that list element was accessed as (car lat), for that value of lat, at that point in execution timeline.

    Similarly, for all the other functions that have (else (function (cdr ... pattern, this means they are tail-recursive too. But if you see something like (else (cons ... (function (cdr ..., then they are not. The cons is in the way.


    To better see what's going on, we rewrite it in an equational pattern-matching pseudocode:

    (rember34 a lat) =
        { (null? lat)        ->  '() 
        ; (eq? a (car lat))  ->  (cdr lat)
        ; else               ->  (rember a (cdr lat))
        }
    

    This further simplifies to three clauses,

    rember34 a []          =  []
    rember34 a [a, ...as]  =  as
    rember34 a [b, ...as]  =  rember a as
    

    Is this pseudocode understandable enough just visually, without being explicitly explained? I hope that it is. The other definition is

    rember37 a []          =  [] 
    rember37 a [a, ...as]  =  as
    rember37 a [b, ...as]  =  [b, ...rember a as]
    

    Now just by looking at these definitions we can see the difference, and what each one does.

    The first, rember34, goes along the list (which is its second argument), (3rd clause) until it finds a in it (its first argument), and if it does (2nd clause), it returns the rest of the list at that point. If there was no a found (3rd clause) and we've reached the end of the list (1st clause) so that the list to continue along is now empty ([]), the empty list [] is returned (1st clause).

    Makes sense. For example,

    rember34 3 [1,2,3,4,5]              %   Tail-Recursive Call:
     = rember34 3 [2,3,4,5]             %    Just Returning The Result...
     = rember34 3 [3,4,5]               %     Call Frame Is Reused.
     = [4,5]
    
    rember34 3 [1,2] 
     = rember34 3 [2]
     = rember34 3 []
     = []
    

    The second, rember37, does the same but with one crucial difference: it keeps each non-matching element before the one that it finds and removes (as before). This means that if there was no such element found, the same list will be recreated. For example,

    rember37 3 [1,2,3,4,5] 
     = [1, ...rember37 3 [2,3,4,5]]              % the->
           => [2, ...rember37 3 [3,4,5]]         %     stack->
                  <= [4,5]                       %           grows
           <= [2,4,5]                            %      <-and
     = [1,2,4,5]                                 %  <-contracts
    
    rember37 3 [1,2] 
     = [1, ...rember37 3 [2]]                    % must remember 1,
           => [2, ...rember37 3 []]              %     and 2, to be used
                  <= []                          %          after the recursive call
           <= [2]                                %      has returned its result
     = [1,2]                                     %  to its caller
    

    Hopefully this clarifies things.


    a sidenote: under the tail-recursion modulo cons optimization, it'd be

    rember37 3 [1,2,3,4,5] 
     = [1, ...rember37 3 [2,3,4,5]]
     = [1, ...[2, ...rember37 3 [3,4,5]]]
     = [1,2, ...rember37 3 [3,4,5]]
     = [1,2, ...[4,5]]
     = [1,2,4,5]
    
    rember37 3 [1,2] 
     = [1, ...rember37 3 [2]]
     = [1, ...[2, ...rember37 3 []]]
     = [1,2, ...rember37 3 []]
     = [1,2, ...[]]
     = [1,2]
    

    which is much like it'd be under lazy evaluation as well!