Search code examples
data-structurestreecompiler-construction

Data structure for program scope?


I'm trying to parse through an AST of a program for a made up language, to be specific I'm trying to emulate the scope, so you enter a function for example and you push a new scope, and when the function is finished being visited by the visitor, it pops the scope. One important aspect is that when we push a new scope, there is a pointer currentScope that is set, which points to the scope we're currently looking at. When we pop the scope, this currentScope is set to be the "outer":

class Scope:
    outer : Scope
    inner : Scope

This is going to happen in multiple passes, but the first pass it's important that it constructs the general tree of scopes. The question I'm asking though is how can I traverse this tree in the same order it was created? For example:

{ // global scope
    { // a
        { // aa

        }
        { // ab
        }
    }
    { // b
    }
}

When I pass over the exact same set of nodes again, in theory they will give me the same tree of scope, but I want to preserve all of the data we collect and store each scope over each pass. In other words, when the second or third pass happens over the AST, when we visit a, currentScope = a, and when we visit aa, then currentScope = aa. Is this possible? I'm really confused with this idea, the whole recursive-y aspect is really messing with my head and I can't seem to figure out how to do this.

Here's what I've tried:

class Scope
    outer : Scope
    inner : Scope
    siblings : []Scope

    Scope(outer):
        this.outer = outer

push_idx = 0

push_scope()
    // set global scope
    if current is null
        global = new Scope(null)
        current = global
        return

    if current.inner is not null:
        // first pass over the AST
        if current_pass == 0:
            new_scope = new Scope(current)
            current.siblings.push(new_scope)
            current = new_scope
            return
        current = current.siblings[push_idx++]
    else:
        new_scope = new Scope(current)
        current.inner = new_scope
        current = current.inner

pop_scope()
    push_idx = 0
    current = current.outer

Though the order doesn't seem correct, and I'm fairly certain this is the wrong approach to this.


Solution

  • A data structure that's often used to track scope inside a compiler is a spaghetti stack, which is essentially a linked list data structure where each scope is a node storing a pointer to its parent scope. Whenever you enter a scope, you create a new node, point it to the enclosing scope, then store the node somewhere in the AST associated with that scope. As you walk the AST, your AST walker stores a pointer to the current scope node. When you enter a scope, you create a new scope node as described above. When you leave a scope, you change the pointer to point to the parent of the current scope. This ends up building a large inverted tree structure where each scope can trace its scope chain up to the root scope - the spaghetti stack.