oop compiler-errors compiler-construction compiler-optimization type-inference

How to handle forward references with type inference

Making a compiler for oop language, with the same language. The compiler currently traverse the ast 4 times, first two are used for resolving type chain, the 3rd is used to populate symbol table for future lookup and final is to resolve the symbols. During the resolution I am also trying to do type inference for resolving type for variable symbols.

Here is the sample code I am using to test the compiler.

var c = a.name;

var a = TestClass(1);

var d = a.test(1);

class TestClass<E> {
  TestClass(E name) : this.name = name;

  TestClass.create(this.name);

  E name;

  E test(E element) {
    return element;
  }

}

In the above code c is dependent on a for inferring its own type but a also need type inference. So, when the compiler tries to look up a when resolving the type for c it gets Unknown and throws a error.

this is my first time making a compiler, I am out of ideas on how to solve this problem.

Solution

Solving the problem

You would need to generate a dependency map of the (unknown) types as the first type inference pass. As a data structure, the map can be either a set of pairs or an associative array of sets. The former is used here for illustrative purposes. For the provided example, we would have a dependency map as follows:

c -> a.name 
a.name -> a
a -> TestClass(1)
TestClass(1) -> TestClass
d -> a.test(1)
a.test(1) -> a.test
a.test -> a

Note that unknowns may depend on more than one other unknown. For example, if we had something along the lines of var a = b + c, then we would have both a -> b and a -> c as dependencies.

I have kept the numeric literal 1 since I am not aware whether Dart allows templates to have constants along the lines of C++ template <unsigned i>. If it does not, we could use the type of a numeric literal instead (e.g. int, Number or Literal<int>) to make analysis faster.

We do not add the implicit template parameters since we do not know which of the calls are made to template functions/classes. If the template parameter is explicitly provided for initialization of a, we would have the following dependencies for a instead.

a -> TestClass<int>(1)
TestClass<int>(1) -> TestClass<int>
TestClass<int> -> TestClass

We can then take all of the unknowns in the dependency map and sort the dependents after the corresponding dependencies. After sorting, we can resolve the types for each of the unknowns without running into issues. For the example, we would get the following list after sorting:

TestClass
TestClass(1)
a
a.test
a.test(1)
d
a.name
c

Improving performance

To reduce the computational cost of the sorting and to detect and report/resolve circular dependencies, we can compute the full sets of dependencies for each unknown before sorting. To do this, we repeatedly expand every dependency in the dependency map until the dependency map no longer changes; this procedure is sometimes called "closure" in compiler-related literature. As pseudocode, the procedure can be expressed as:

let set: set of (T, T); # our set of dependencies
let unknowns: set of T; # the set of all the unknowns
let prevSize = 0;

while prevSize < set.size: # continue while set is growing
    prevSize = set.size;
    for every (dependent, dependency) in set: # b -> c
        for every unknown in unknowns:
            if (unknown, dependent) in set: # if a -> b, then a -> c
                add (unknown, dependency) to set

The expansion is completed only after the AST has been fully traversed, although we could do partial updates at each step where a new dependency is added to the dependency map. In the final set, circular dependencies will appear in the form of unknown -> unknown. To illustrate the resulting set, here is a partial result for the provided example (which has no circular dependencies):

c -> a.name
c -> a
c -> TestClass(1)
c -> TestClass
...
d -> a.test(1)
d -> a.test
d -> a
d -> TestClass(1)
d -> TestClass
a.test(1) -> a.test
a.test(1) -> a
a.test(1) -> TestClass(1)
a.test(1) -> TestClass
...