I'm implementing a scheme-like lisp, that will at some point be compiled into some form of bytecode, which can be seen here. Unfortunately, I have coded myself into a bit of a hole, and am not sure how to get out of it. Essentially, given a lambda that looks like this (the outer b is purposely unused):
(lambda (a b) (lambda (c) (+ a c)))
My code yields this syntax tree:
[
type: :lambda,
args: [[type: :word, val: 'a'], [type: :word, val: 'b']],
body: [
[
type: :lambda,
args: [[type: :word, val: 'c']],
body: [
[
type: :expr,
val: [
[type: :word, val: '+'],
[type: :word, val: 'a'],
[type: :word, val: 'c']
]
]
]
]
]
]
Unfortunately, when I actually get to generating bytecode, it won't be easy to create the closures necessary for these lambdas (as far as I can tell). Ideally, I'd like to generate a tree that looks like this:
[
type: :lambda,
args: [[type: :word, val: 'a'], [type: :word, val: 'b']],
closure: [],
body: [
[
type: :lambda,
args: [[type: :word, val: 'c']],
closure: [[type: :word, val: 'a']],
body: [
[
type: :expr,
val: [
[type: :word, val: '+'],
[type: :word, val: 'a'],
[type: :word, val: 'c']
]
]
]
]
]
]
It is easy enough to tell if a given parameter should be part of the closure by seeing if it shows up in the env, however, since i just call Enum.map
over the body I'm not sure how to get that information back to my lambda object. I don't need specific code for how to fix this, but a general guideline/hint/push in the right direction would be great (I know it's a bit vague, but I wasn't quite sure how to make more a specific test case for this).
You can walk down your AST constructing a list of bound identifiers at each node.
E.g., a lambda
node binds its arguments (feel free to rewrite those names if they're already in your bound list), as well as let
and let*
. You also build a list of referenced free identifiers for each AST node while walking back up this tree.
lambda
, let
and let*
remove identifiers from those free variable lists.
The rest is easy: at every lambda
node you calculate the intersection between referenced and bound lists, and the result will be the environment this closure must capture. If it's empty, this is a simple function with no environment.
In your example, it'll be:
[b:() f:()](lambda (a b) [b:(a b) f:(a)] (lambda (c) [b: (a b c) f: (a c)] (+ a c)))
As you can see, the inner lambda have a
in common between its b:
and f:
lists, so you must emit a closure allocation instruction here, constructing an environment of one element, a
.
You can combine this pass with variable renaming (e.g., turning lambda argument names into argument numbers).