I have read in several places such as here about "Access Links" and "The Display Method" for looking up non-local variables. However, none seem to touch on the complicated situation like in JavaScript where you can have functions inside of functions, and return those functions from functions, all the while keeping alive the "internal scope" of those functions.
Here is a complicated nested function.
let m = foo()
m()
m()
function foo() {
let x = 10
function bar() {
let y = 2 ** x
function hello() {
let z = 30
function another() {
// PLACE_2
let q = x + y + z
function nested() {
q += (17 * x) - (3 * z)
// PLACE_1
console.log(q)
console.log(w)
}
return nested
}
// PLACE_3
let w = another()
return w
}
return hello
}
// PLACE_4
let a = bar()
let nested = a() // hello returns another
nested()
nested()
return nested
}
At PLACE_1
, we are using variables x/z/q, but not y/a/nested. Basically we create all these lingering objects in the scope tree.
How does an activation record with "Access Links" or "Display Method" work in a situation like this? I am used to having functions call other functions, but not keep the environment of past functions around. How do these work in a complex system like this with scopes lingering?
I am trying to implement a compiler and am currently stuck on how to actually implement the Activation Records. I can do it where the activation records are a simple parent/child relationship based on calling order, but I don't see how to make it work with a situation like this. I want to have a sort of variable lookup mechanism, but I don't know what needs to be stored in a situation like this.
Reading something like this doesn't give any insight into real-world situations like this, how it's modeled under the hood.
Displays can only be used if the semantics of the language don't allow non-local variables to outlive their original scope, since the display is basically an efficient eay to find an activation record on the stack. If you want to implement true closures, you'll need to use a different mechanism which involves heap-allocation of enclosed variables. (Unless you can prove that the enclosed variable never escapes. That's often the case, but it can be hard to prove.)
AUUI, Javascript retains the entire activation record of the function call which created the closed variable. But that's dependent on the semantics of the language, and unnecessarily keeping active links to other variables in the activation record prevents them from being garbage collected in a timely manner. It should only be necessary to keep links to the variables actually being enclosed.
Note that there are two distinct cases (again, this depends on language semantics). In the simple case, the captured variable is immutable (although it might contain a mutable value). In that case, you can capture the value by adding it as a data member in the closure. (The closure object itself needs to be heap-allocated but you don't need to do that until the closure is actually created, which might never happen.)
In the more complicated case, the variable itself is mutable and potentially mutated by the inner function [Note 1]. In that case, the variable itself must be "boxed"; that is, enclosed in a container which is used as a handle for the variable. But the mutation might happen before the variable's scope terminates, in which case the change needs to be visible in that scope. If the variable is boxed, the outer function needs to be aware of that because access to the variable is indirect. So boxing the variable onto the heap is a cost which must be paid before it is known to be necessary (depending on the quality of the escape analysis).
That's where the Lua implementation is interesting, at least. It manages to avoid heap-allocating the box in many cases, but there is an additional cost when it does so. That seems to work out well for Lua use cases but I don't know whether it would be a good solution in your case.
There's a very common case where a variable is mutable because its value cannot conveniently be computed in a single initialisation expression, but all mutations occur before the variable is captured. Such a variable can be treated as though it were immutable (or as though it was computed and then made the value of a different variable with the same name). That case is fairly easy to detect. (There are also variables whose last mutation is after capture but before the first invocation of a capturing function. Those can also usually be detected.) These variables can be captured as though they were constant, just putting the value into the closure.
Informal surveys indicate that the vast majority of captured variables are either constant or constant-after-capture, which could considerably reduce the need for heap-allocated boxes.