Search code examples
lispsymbolslambda

Lisp changes function to lambda expression when stored in function cell


In this post, I ask tangentially why when I declare in SBCL

(defun a (&rest x)
    x)

and then check what the function cell holds

(describe 'a)
COMMON-LISP-USER::A
  [symbol]

A names a compiled function:
  Lambda-list: (&REST X)
  Derived type: (FUNCTION * (VALUES LIST &OPTIONAL))
  Source form:
    (LAMBDA (&REST X) (BLOCK A X))

I see this particular breakdown of the original function. Could someone explain what this output means? I'm especially confused by the last line

 Source form:
        (LAMBDA (&REST X) (BLOCK A X))

This is mysterious because for some reason not clear to me Lisp has transformed the original function into a lambda expression. It would also be nice to know the details of how a function broken down like this is then called. This example is SBCL. In Elisp

(symbol-function 'a)

gives

(lambda (&rest x) x)

again, bizarre. As I said in the other post, this is easier to understand in Scheme -- but that created confusion in the answers. So once more I ask, Why has Lisp taken a normal function declaration and seemingly stored it as a lambda expression?


Solution

  • I'm still a bit unclear what you are confused about, but here is an attempt to explain it. I will stick to CL (and mostly to ANSI CL), because elisp has a lot of historical oddities which just make things hard to understand (there is an appendix on elisp). Pre-ANSI CL was also a lot less clear on various things.

    I'll try to explain things by writing a macro which is a simple version of defun: I'll call this defun/simple, and an example of its use will be

    (defun/simple foo (x)
      (+ x x))
    

    So what I need to do is to work out what the expansion of this macro should be, so that it does something broadly equivalent (but simpler than) defun.

    The function namespace & fdefinition

    First of all I assume you are comfortable with the idea that, in CL (and elisp) the namespace of functions is different than the namespace of variable bindings: both languages are lisp-2s. So in a form like (f x), f is looked up in the namespace of function bindings, while x is looked up in the namespace of variable bindings. This means that forms like

    (let ((sin 0.0))
      (sin sin))
    

    are fine in CL or elisp, while in Scheme they would be an error, as 0.0 is not a function, because Scheme is a lisp-1.

    So we need some way of accessing that namespace, and in CL the most general way of doing that is fdefinition: (fdefinition <function name>) gets the function definition of <function name>, where <function name> is something which names a function, which for our purposes will be a symbol.

    fdefinition is what CL calls an accessor: this means that the setf macro knows what to do with it, so that we can mutate the function binding of a symbol by (setf (fdefinition ...) ...). (This is not true: what we can access and mutate with fdefinition is the top-level function binding of a symbol, we can't access or mutate lexical function bindings, and CL provides no way to do this, but this does not matter here.)

    So this tells us what our macro expansion needs to look like: we want to set the (top-level) definition of the name to some function object. The expansion of the macro should be like this:

    (defun/simple foo (x)
      x)
    

    should expand to something involving

    (setf (fdefinition 'foo) <form which makes a function>)
    

    So we can write this bit of the macro now:

    (defmacro defun/simple (name arglist &body forms)
      `(progn
         (setf (fdefinition ',name)
               ,(make-function-form name arglist forms))
         ',name))
    

    This is the complete definition of this macro. It uses progn in its expansion so that the result of expanding it is the name of the function being defined, which is the same as defun: the expansion does all its real work by side-effect.

    But defun/simple relies on a helper function, called make-function-form, which I haven't defined yet, so you can't actually use it yet.

    Function forms

    So now we need to write make-function-form. This function is called at macroexpansion time: it's job is not to make a function: it's to return a bit of source code which will make a function, which I'm calling a 'function form'.

    So, what do function forms look like in CL? Well, there's really only one such form in portable CL (this might be wrong, but I think it is true), which is a form constructed using the special operator function. So we're going to need to return some form which looks like (function ...). Well, what can ... be? There are two cases for function.

    • (function <name>) denotes the function named by <name> in the current lexical environment. So (function car) is the function we call when we say (car x).
    • (function (lambda ...)) denotes a function specified by (lambda ...): a lambda expression.

    The second of these is the only (caveats as above) way we can construct a form which denotes a new function. So make-function-form is going to need to return this second variety of function form.

    So we can write an initial version of make-function-form:

    (defun make-function-form (name arglist forms)
      (declare (ignore name))
      `(function (lambda ,arglist ,@forms)))
    

    And this is enough for defun/simple to work:

    > (defun/simple plus/2 (a b)
        (+ a b))
    plus/2
    
    > (plus/2 1 2)
    3
    

    But it's not quite right yet: one of the things that functions defined by defun can do is return from themselves: they know their own name and can use return-from to return from it:

    > (defun silly (x)
        (return-from silly 3)
        (explode-the-world x))
    silly
    
    > (silly 'yes)
    3
    

    defun/simple can't do this, yet. To do this, make-function-form needs to insert a suitable block around the body of the function:

    (defun make-function-form (name arglist forms)
      `(function (lambda ,arglist
                   (block ,name
                     ,@forms))))
    

    And now:

    > (defun/simple silly (x)
        (return-from silly 3)
        (explode-the-world x))
    silly
    
    > (silly 'yes)
    3
    

    And all is well.

    This is the final definition of defun/simple and its auxiliary function.

    Looking at the expansion of defun/simple

    We can do this with macroexpand in the usual way:

    > (macroexpand '(defun/simple foo (x) x))
    (progn
      (setf (fdefinition 'foo) 
            #'(lambda (x) 
                (block foo
                  x)))
      'foo)
    t
    

    The only thing that's confusing here is that, because (function ...) is common in source code, there's syntactic sugar for it which is #'...: this is the same reason that quote has special syntax.

    It's worth looking at the macroexpansion of real defun forms: they usually have a bunch of implementation-specific stuff in them, but you can find the same thing there. Here's an example from LW:

     > (macroexpand '(defun foo (x) x))
    (compiler-let ((dspec::*location* '(:inside (defun foo) :listener)))
      (compiler::top-level-form-name (defun foo)
        (dspec:install-defun 'foo
                             (dspec:location)
                             #'(lambda (x)
                                 (declare (system::source-level
                                           #<eq Hash Table{0} 42101FCD5B>))
                                 (declare (lambda-name foo))
                                 x))))
    t
    

    Well, there's a lot of extra stuff in here, and LW obviously has some trick around this (declare (lambda-name ...)) form which lets return-from work without an explicit block. But you can see that basically the same thing is going on.

    Conclusion: how you make functions

    In conclusion: a macro like defun, or any other function-defining form, needs to expand to a form which, when evaluated, will construct a function. CL offers exactly one such form: (function (lambda ...)): that's how you make functions in CL. So something like defun necessarily has to expand to something like this. (To be precise: any portable version of defun: implementations are somewhat free to do implementation-magic & may do so. However they are not free to add a new special operator.)

    What you are seeing when you call describe is that, after SBCL has compiled your function, it's remembered what the source form was, and the source form was exactly the one you would have got from the defun/simple macro given here.


    Notes

    lambda as a macro

    In ANSI CL, lambda is defined as a macro whose expansion is a suitable (function (lambda ...)) form:

    > (macroexpand '(lambda (x) x))
    #'(lambda (x) x)
    t
    
    > (car (macroexpand '(lambda (x) x)))
    function
    

    This means that you don't have to write (function (lambda ...)) yourself: you can rely on the macro definition of lambda doing it for you. Historically, lambda wasn't always a macro in CL: I can't find my copy of CLtL1, but I'm pretty certain it was not defined as one there. I'm reasonably sure that the macro definition of lambda arrived so that it was possible to write ISLisp-compatible programs on top of CL. It has to be in the language because lambda is in the CL package and so users can't portably define macros for it (although quite often they did define such a macro, or at least I did). I have not relied on this macro definition above.

    defun/simple does not purport to be a proper clone of defun: its only purpose is to show how such a macro can be written. In particular it doesn't deal with declarations properly, I think: they need to be lifted out of the block & are not.

    Elisp

    Elisp is much more horrible than CL. In particular, in CL there is a well-defined function type, which is disjoint from lists:

    > (typep '(lambda ()) 'function)
    nil
    
    > (typep '(lambda ()) 'list)
    t
    
    > (typep (function (lambda ())) 'function)
    t
    
    > (typep (function (lambda ())) 'list)
    nil
    

    (Note in particular that (function (lambda ())) is a function, not a list: function is doing its job of making a function.)

    In elisp, however, an interpreted function is just a list whose car is lambda (caveat: if lexical binding is on this is not the case: it's then a list whose car is closure). So in elisp (without lexical binding):

    ELISP> (function (lambda (x) x))
    (lambda (x)
      x)
    

    And

    ELISP> (defun foo (x) x)
    foo
    ELISP> (symbol-function 'foo)
    (lambda (x)
      x)
    

    The elisp intepreter then just interprets this list, in just the way you could yourself. function in elisp is almost the same thing as quote.

    But function isn't quite the same as quote in elisp: the byte-compiler knows that, when it comes across a form like (function (lambda ...)) that this is a function form, and it should byte-compile the body. So, we can look at the expansion of defun in elisp:

    ELISP> (macroexpand '(defun foo (x) x))
    (defalias 'foo
      #'(lambda (x)
          x))
    

    (It turns out that defalias is the primitive thing now.)

    But if I put this definition in a file, which I byte compile and load, then:

    ELISP> (symbol-function 'foo)
    #[(x)
      "\207"
      [x]
      1]
    

    And you can explore this a bit further: if you put this in a file:

    (fset 'foo '(lambda (x) x))
    

    and then byte compile and load that, then

    ELISP> (symbol-function 'foo)
    (lambda (x)
      x)
    

    So the byte compiler didn't do anything with foo because it didn't get the hint that it should. But foo is still a fine function:

    ELISP> (foo 1)
    1 (#o1, #x1, ?\C-a)
    

    It just isn't compiled. This is also why, if writing elisp code with anonymous functions in it, you should use function (or equivalently #'). (And finally, of course, (function ...) does the right thing if lexical scoping is on.)

    Other ways of making functions in CL

    Finally, I've said above that function & specifically (function (lambda ...)) is the only primitive way to make new functions in CL. I'm not completely sure that's true, especially given CLOS (almost any CLOS will have some kind of class instances of which are functions but which can be subclassed). But it does not matter: it is a way and that's sufficient.