compiler-construction programming-languages closures

How is a referencing environment generally implemented for closures?

Let's say I have a statically/lexically scoped language with deep binding and I create a closure. The closure will consist of the statements I want executed plus the so called referencing environment, or, to quote this post, the collection of variables which can be used.

What does this referencing environment actually look like implementation-wise? I was recently reading about ObjectiveC's implementation of blocks, and the author suggests that behind the scenes you get a copy of all of the variables on the stack and also of all the references to heap objects. The explanation claims that you get a "snapshot" of the referencing environment at the point in time of the closure's creation.

Is that more or less what happens, or did I misread that?
Is anything done to "freeze" a separate copy of the heap objects, or is it safe to assume that if they get modified between closure creation and the closure executing, the closure will no longer be operating on the original version of the object?
If indeed there's copying being made, are there memory usage considerations in situations where one might want to create plenty of closures and store them somewhere?

I think that misunderstanding of some of these concepts might lead to tricky issues like the ones Eric Lippert mentions in this blog post. It's interesting because you'd think that it wouldn't make sense to keep a reference to a value type that might be gone by the time the closure is called, but I'm guessing that in C# the compiler will figure out that the variable is needed later and put it into the heap instead.

It seems that in most memory-managed languages everything is a reference and thus ObjectiveC is a somewhat unique situation with having to deal with copying what's on the stack.

Solution

Here's a running example in pseudo-javascript-like syntax.

function f(x) {
  var y = ...;
  function g(z) {
    function h(w) {
      .... y, z, w ....
    }
    .... x, h ....
  }
  .... x, g ....
}

One representation is a linked chain of environments. That is, a closure consists of a code pointer, some slots, and a reference to the enclosing closure or the top-level environment. In this representation,

f = [<code>, <top-level-env>]
g = [<code>, f, x, y]
h = [<code>, g, z]

except sometimes it's better to let every function have a direct reference to the top-level environment, since it's used so often:

f = [<code>, <top-level-env>]
g = [<code>, <top-level-env>, f, x, y]
h = [<code>, <top-level-env>, g, z]

(There are other variations too.)

One advantage of this representation is that you can store mutable variables right in the closure. (Well, maybe, depending on how you represent function activations.) One disadvantage is some variables may take multiple hops to reach, if you have deeply nested closures. Another disadvantage is that if a closure outlives its parent (eg, g returns h) then this representation might prevent the GC from collecting environment frames that are mostly or even completely unreachable.

Another representation is "flat closures": each closure contains a code pointer and slots for all of the code's free variables.

g = [<code>, x, y]
h = [<code>, y, z]

This representation fixes the space/GC problem; no closure pins another closure in memory. On the other hand, free-variable slots are copied instead of shared, so if there is a nested closure with many free variables---or many instances of a nested closure---overall memory usage might be higher. Also, this representation typically requires storage for mutable variables to be heap-allocated (but only for variables that are actually mutated and only when the mutation cannot be automatically rewritten).

There are also hybrid approaches. For example, you might have mostly-flat closures but treat the top-level environment specially:

g = [<code>, <top-level-env>, x, y]

Or you might have a "sufficiently clever" (or at least "sufficiently ambitious") compiler that tries to pick between representations based on number of free variables, nesting depth, etc.