Is there an overall explanation of definitional scoping in Rebol and Red?


From the REBOL/Core Users Guide, and What is Red, I have learned that both Rebol and Red use definitional scoping.

From the guide, I know it is a form of static scoping: "the scope of a variable is determined when its context is defined". It is also called runtime lexical scoping, and is described as a dynamic form of static scoping that depends on context definitions.

I know that in computer science there are two forms of scoping: lexical scoping (static scoping) and dynamic scoping. This definitional scoping confuses me.

So what is definitional scoping?


Solution

  • Rebol actually does not have scoping at all.

    Let's take this code:

    rebol []
    
    a: 1
    
    func-1: func [] [a]
    
    inner: context [
        a: 2
        func-2: func [] [a]
        func-3: func [/local a] [a: 3 func-1]
    ]
    

    So, with that code loaded, if Rebol had lexical scoping, this is what you'd see:

    >> reduce [func-1 inner/func-2 inner/func-3]
    == [1 2 1]
    

    That would be because func-1 uses the a from the outer scope, the a used by func-2 is from the inner scope, and func-3 calls func-1, which still uses a from the outer scope where it was defined regardless of what's in func-3.

    If Rebol had dynamic scoping, this is what you'd see:

    >> reduce [func-1 inner/func-2 inner/func-3]
    == [1 2 3]
    

    That would be because func-3 redefines a, then calls func-1, which just uses the most recent active definition of a.

    Now for Rebol, you get that first result. But Rebol doesn't have lexical scoping. So why?

    Rebol fakes it. Here's how it works.

    In compiled languages, you have scopes. As the compiler goes through the file, it keeps track of the current scope, and when it sees a nested scope, that becomes the current scope. For lexical scoping, the compiler keeps a reference to the outer scope, and then looks up words that weren't defined in the current scope by following the links to the outer scopes, until it either finds the word or runs out of scopes. Dynamically scoped languages do something similar, but at runtime, going up the call stack.

    Rebol doesn't do any of that; in particular it isn't compiled, it's built, at runtime. What you think of as code is actually data, blocks of words, numbers and such. The words are data structures that have a pointer in them called a "binding".
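
    To make the "code is data" point concrete, here is a minimal sketch. The names a and code are made up for illustration; everything else is a standard function. It builds a block, inspects it as data, runs it, edits it like any other series, and runs it again:

    ```rebol
    rebol []

    a: 1
    code: [a + 1]           ; just a block holding a word, another word and an integer
    print length? code      ; 3
    print type? first code  ; word
    print do code           ; 2

    insert code [10 *]      ; edit the "code" like any other series
    probe code              ; [10 * a + 1]
    print do code           ; 11 (evaluated left to right: 10 * 1, then + 1)
    ```

    Nothing was parsed or compiled between the two do calls; the block was simply changed and evaluated again.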

    When that script is first loaded, all the words in the script are added to the environment object of the script (which we inappropriately call a "context", though it's not). While the words are being gathered, the script data is changed. Any word found in the script's "context" is linked to the "context", or "bound". Those bindings mean that you can just follow that one link and get to the object where the value of that word is stored. It's really fast.

    Then, once that's done, we start running the script. And then we get to this bit: func [] [a]. That is not really a declaration, that's a call to a function named func which takes a spec block and a code block and uses them to build a function. That function also gets its own environment object, but with words declared in the function's spec. In this case there are no words in the spec, so it's an empty object. Then the code block is bound to that object. But in this case there is no a in that object, so nothing is done to the a, it keeps the binding it already had from when it was bound before.
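
    Since func is an ordinary function call, you can hand it blocks that were built earlier. A small sketch, where the names x, spec, body and double are just for illustration:

    ```rebol
    rebol []

    x: 100                  ; script-level x
    spec: [x]               ; a plain block that will serve as the function spec
    body: [x * 2]           ; a plain block; at this point its x is the script-level x
    print do body           ; 200

    double: func spec body  ; func builds the function and binds x to the function's own object
    print double 21         ; 42
    print x                 ; 100, the script-level x was never touched
    ```

    The same word x in the body block means the script-level x before func gets hold of it, and the function's argument inside the function that func builds.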

    Same goes for the context [...] call - yes, that's a call to a function inappropriately named context, which builds an object by calling make object!. The context function takes a block of data, and it searches for set-words (those things with the trailing colons, like a:), then builds an object with those words in it, then it binds all of the words in that block and all the nested blocks to the words that are in the object, in this case a, func-2 and func-3. And that means that the a's in that block of code have their bindings changed, to point to that object instead.

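    When func-2 is defined, its spec has no words in it, so the binding of the a in its code block is not overridden; it still points to the inner object. When func-3 is defined, it has an a in its spec, so the a: in its code block has its binding overridden to point to func-3's own environment object.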

    The funny thing about all of this is that there aren't any scopes at all. That first a: and the a in func-1's code body are only bound once, so they keep their first binding. The a: in inner's code block and the a in func-2's are bound twice, so they keep their second binding. The a: in func-3's code is bound three times, so it also keeps its last binding. It's not scopes, it's just code being bound and then smaller bits of code being bound again, and so on until it's done.
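
    One way to see that these are direct links to where the values are stored, rather than scope lookups, is to change the stored values after everything has been built. This is a sketch that reuses the names from the script above and leaves out func-3:

    ```rebol
    rebol []

    a: 1
    func-1: func [] [a]

    inner: context [
        a: 2
        func-2: func [] [a]
    ]

    probe reduce [func-1 inner/func-2]  ; [1 2]

    a: 10                               ; change the a stored in the script's object
    inner/a: 20                         ; change the a stored in the inner object

    probe reduce [func-1 inner/func-2]  ; [10 20]
    ```

    Each function just follows the binding of its own a to the object that holds the value, so each one sees the change made to that object and not the other.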

    Each round of binding is performed by a function that is "defining" something (really, building it), and then when that code runs and calls other functions that define something else, those functions perform another round of binding on their own little subset of code. That's why we call it "definitional scoping"; while it really isn't scoping, it is what serves the purpose of scoping in Rebol, and it's close enough to the behavior of lexical scoping that at first glance you can't tell the difference.

    It really becomes different when you realize that these bindings are direct, and you can change them (sort of, you can make new words with the same name and a different binding). That same function that those definition functions call, you can call yourself: it's named bind. With bind you can break the illusion of scoping and make words that bind to any object you can get access to. You can do wonderful tricks with bind, even make your own definition functions. It's loads of fun!
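
    Here is a minimal sketch of what that looks like with bind. The names obj-1, obj-2, words and code are made up for the example:

    ```rebol
    rebol []

    a: 1
    obj-1: context [a: 2]
    obj-2: context [a: 3]

    ; Three words that are all spelled "a" but carry different bindings.
    words: reduce ['a bind 'a obj-1 bind 'a obj-2]
    probe words                    ; [a a a]
    probe reduce words             ; [1 2 3]

    ; Rebinding a copy of a code block leaves the original block alone.
    code: [a * 10]
    print do code                  ; 10
    print do bind copy code obj-2  ; 30
    print do code                  ; 10
    ```

    No scope could explain a single block whose three identical-looking words evaluate to three different values; bindings explain it directly. Note that bind changes the block you give it, which is why the sketch rebinds a copy.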

    As for Red, Red is compilable, but it also includes a Rebol-like interpreter, binding and all of the goodies. When it's defining things with the interpreter it does definitional scoping as well.

    Does that help make things more clear?