Search code examples
lispundefined-variable

It seems only lisp allow use a variable without define it first?


I found it seems only in lisp can define this:

(lambda (x)(+ x y 1))

in other languages,even it's functional language,a variable must define first,then to use it

so in lisp,there is concept "free variable","bound variable".Is other language have this concept?because all variable must define first,so all variable is "bound variable"?The variable have an initial value,in the global or other scope?

I think the lambda concept is early,but if any variable,or value,must define first then to use,isn't it easyier to understand and to use?

Thanks!


Solution

  • I think this question has two parts. There is a general question of free variable access in Lisp (and other languages), which is a large subject since there are many, many Lisps and even more other languages, including ones with dynamic scope for some variables (CL) and ones which have only dynamic scope (elisp until recently, many other old implementations). There is also a question about when variable references need to get resolved in Lisps (and other languages), and in particalar the need for forward references and when and how they get resolved..

    In this answer I am only addressing the second part, about forward references, which I think is what is directly confusing you. I will give examples using Racket, with the r5rs language, as that is what you have used in a comment.

    First of all, any language which supports recursion without jumping through extreme hoops (aka using the Y combinator) needs to support references to names which are not yet bound, which are called forward references. Consider this:

    #lang r5rs
    
    (define foo
      (lambda (x)
        (if (null? x)
            0
            (+ 1 (foo (cdr x))))))
    

    OK, so when the function is evaluated there is a reference to foo for the recursive call. But foo is not yet bound at that point, because foo is going to be bound to the function itself and that's what we're defining. And indeed if you do something like this:

    (let ((r (lambda (x)
               (if (null? x)
                   0
                   (+ 1 (r (cdr x)))))))
      r)
    

    This is an error, because r is indeed not yet defined. Instead you need to use letrec which does suitable magic to ensure that things work.

    Well, you might argue that letrec turns into something like this:

    (let ((r #f))
      (set! r (lambda (x)
                (if (null? x)
                    0
                    (+ 1 (r (cdr x))))))
      r)
    

    And perhaps it does. But what about this?

    #lang r5rs
    
    (define foo
      (lambda (x)
        (if (null? x)
            0
            (bar x))))
    
    (define bar
      (lambda (x)
        (+ 1 (foo (cdr x)))))
    

    And for on for more elaborate programs.

    So there is a general problem here about forward reference. It is essentially impossible to write programs in such a way that, in the text of the program, there are not references to names which are not yet known about. Note that in the above program foo refers to bar and bar refers to foo so there is no order of the definitions of foo and bar which does not involve references to names which are not yet bound in the source.

    Note that this problem affects essentially all programming languages: this is nothing unique to Lisp-family languages.

    So, how is this resolved? Well, the answer is 'it depends on how the specific language you care about is defined'. I am not completely sure what the R5RS specification says about this, but I think I am sure what Racket says about it.

    What Racket says is that forward references all need to be sorted out at the module level. So if I have a Racket source file like this:

    #lang r5rs
    
    (define a
      (lambda (x) (+ x y)))
    
    (define y 1)
    

    This is fine, because at the end of the module both a and x are defined, and so the definition of a is fine. This is no different than the definitions of foo and bar above, and note that a source file like this:

    #lang r5rs
    
    (define foo
      (lambda (x)
        (if (null? x)
            0
            (bar x))))
    

    is not legal: bar is not bound:

    $ racket ts.rkt
    ts.rkt:7:9: bar: unbound identifier
    [...]
    

    So the answer to the question of forward-reference is two-fold:

    • writing programs in almost any practical language, and in particular in Lisps, requires forward references to names which are not yet bound at the point of reference;
    • language specifications need to define when and how such forward references get resolved.

    I think that this is what is confusing you. In particular a Racket module which contains only this:

    #lang r5rs
    
    (define a (lambda (x) (+ x y)))
    

    Is not legal in Racket. But this module:

    #lang r5rs
    
    (define a (lambda (x) (+ x y)))
    (define y 1)
    

    Is legal, because y is bound at the point when forward references get resolved.


    Common Lisp is more complicated than this for two reasons:

    • it's a Lisp-2, so function and variable bindings live in different namespaces;
    • there are no standard toplevel lexical variable bindings.

    So for instance if we take something like this (assumed to be at toplevel, with no other user definitions preceeding it):

    (defun foo (x)
      (+ x y))
    

    Then y can't be a reference to a lexical variable, because the lexical environment of foo is empty, since there are no toplevel lexical variable bindings.

    So y must be a reference to a dynamic variable (a special variable in CL speak). But in fact, CL resolves the forward-reference problem for dynamic variables by saying that such references are not allowed. So the above definition is not legal CL. Many compilers will compile it with a warning, but they don't have to.

    The following variant, however, is fine (note I've used the CL convention for dynamic variables).

    (defvar *y* 1)
    
    (defun foo (x)
      (+ x *y*))
    

    This is fine because *y* is known about before foo is defined.

    And in fact I lied: forward references to dynamic vaiables are allowed, but you have to tell the language about them:

    (defun foo (x)
      (declare (special *y*))
      (+ x *y*))
    

    And now, if there is a later (defvar *y* ...) (or equivalent) a call to foo will be fine. If there is no such globally-special proclamation of *y* then I think what happens if foo is called is either undefined or an error (and I kind of hope the latter but I'm not sure.

    Finally, even without a global special proclamation for *y*, this is also fine:

    (let ((*y* 3))
      (declare (special *y*))
      (foo 1))
    

    This is fine because *y* is locally declared special (there is a bound declaration for it).

    Forward references to functions are allowed in CL, and there are complicated rules about when they need to be resolved which I am not sure I can remember.