Search code examples
recursionclosuresschemeguilethe-little-schemer

Performance Impact of Creating Enclosed Procedure in Scheme During Recursion


I'm making my way through the book The Little Schemer to start to learn to think in Lisp. As you get into it and really cover the use of lambdas, the 'remove' procedure is written in the following general form, which returns a remove procedure for arbitrary test test?:

(define rember-f
  (lambda (test?)
    (lambda (a l)
      (cond
        ((null? l) (quote ()))
        ((test? (car l) a) (cdr l))
        (else (cons (car l)
                ((rember-f test?) a (cdr l))))))))

I understand how this works just fine, but a plain reading of it suggests that at each recursive step, it is the procedure rember-f that is called again to generate a new enclosed procedure. This would mean when you call your returned procedure on a list, it calls rember-f to generate the same procedure again anew and then that new one is what is called for recursion (if that is not clear see my fix below). I understand that this may be optimized away, but in lieu of not knowing whether it is (and also in attempting to get my head around this syntax anyway), I managed after some experimentation to move the recursion to the procedure itself rather than the enclosing procedure as follows:

(define rember-f
  (lambda (test?)
    (define retfun
      (lambda (a l)
        (cond
          ((null? l) (quote ()))
          ((test? (car l) a) (cdr l))
          (else (cons (car l) (retfun a (cdr l)))))))
    retfun))

I have verified that this works as expected. The return value is a procedure that removes the first element of a list (arg 2) matching a value (arg 1). It looks to me like this one only calls rember-f once, which guarantees it only generates one enclosed procedure (this time with a name, retfun).

This is actually interesting to me because unlike the usual tail call optimization, which is about not consuming space on the call stack and so making recursion about as efficient as iteration, in this case the compiler would have to determine that (rember-f test?) is the enclosing procedure scope unmodified and so replace it with the same return value, which is the anonymous (lambda (a l) ...). It would not surprise me at all to learn that the interpreter / compiler does not catch this.

Yes, I know that scheme is a specification and there are many implementations, which get the various functional programming optimizations right to differing degrees. I am currently learning by experimenting in the guile REPL, but would be interested in how different implementations compare on this issue.

Does anyone know how Scheme is supposed to behave in this instance?


Solution

  • You are right to be concerned about the additional repeated lambda abstractions. For example you wouldn't write this, would you?

    (cond ((> (some-expensive-computation x) 0) ...)
          ((< (some-expensive-computation x) 0) ...)
          (else ...))
    

    Instead we bind the result of some-expensive-computation to an identifier so we can check multiple conditions on the same value -

    (let ((result (some-expensive-computation x)))
     (cond ((> result 0) ...)
           ((< result 0) ...)
           (else ...)))
    

    You discovered the essential purpose of so-called "named let" expressions. Here's your program -

    (define rember-f
      (lambda (test?)
        (define retfun
          (lambda (a l)
            (cond
              ((null? l) (quote ()))
              ((test? (car l) a) (cdr l))
              (else (cons (car l) (retfun a (cdr l)))))))
        retfun))
    

    And its equivalent using a named-let expression. Below we bind the let body to loop, which is a callable procedure allowing recursion of the body. Notice how the lambda abstractions are used just once, and the inner lambda can be repeated without creating/evaluating additional lambdas -

    (define rember-f
      (lambda (test?)
        (lambda (a l)
          (let loop ; name, "loop", or anything of your choice
           ((l l))  ; bindings, here we shadow l, or could rename it
           (cond
             ((null? l) (quote ()))
             ((test? (car l) a) (cdr l))
             (else (cons (car l) (loop (cdr l))))))))) ; apply "loop" with args
    

    Let's run it -

    ((rember-f eq?) 'c '(a b c d e f))
    
    '(a b d e f)
    

    The syntax for named-let is -

    (let proc-identifier ((arg-identifier initial-expr) ...)
      body ...)
    

    Named-let is a syntax sugar of a letrec binding -

    (define rember-f
      (lambda (test?)
        (lambda (a l)
          (letrec ((loop (lambda (l)
                           (cond
                             ((null? l) (quote ()))
                             ((test? (car l) a) (cdr l))
                             (else (cons (car l) (loop (cdr l))))))))
            (loop l)))))
    
    ((rember-f eq?) 'c '(a b c d e f))
    
    '(a b d e f)
    

    Similarly, you could imagine using a nested define -

    (define rember-f
      (lambda (test?)
        (lambda (a l)
          (define (loop l)
            (cond
              ((null? l) (quote ()))
              ((test? (car l) a) (cdr l))
              (else (cons (car l) (loop (cdr l))))))
          (loop l))))
    
    ((rember-f eq?) 'c '(a b c d e f))
    
    '(a b d e f)
    

    PS, you you can write '() in place of (quote ())