I am currently implementing generic functions for my own language, but I got stuck on the following problem:
Generic functions can get called from another source file (another parser instance). Let's assume we have a generic function in source file B and we call it from source file A, which imports source file B. When this happens, I need to type-check the body of the function (source file B) once again for every distinct manifestation of concrete types, derived from the function call (source file A). For that, I need to visit the body of the function in source file B potentially multiple times.
Source file B:
type T dyn;
public p printFormat<T>(T element) {
printf("Test");
}
Source file A:
import "source-b" as b;
f<int> main() {
b.printFormat<double>(1.123);
b.printFormat<int>(543);
b.printFormat<string[]>({"Hello", "World"});
}
I tried to implement this by putting the code that analyzes the function body and its children into an inner function and calling it every time I encounter a call to that particular function from anywhere (including from other source files). For some reason this does not work: I always get a segmentation fault. Maybe this is because the whole tree was already visited once?
For additional context: C++ source code of my visitor
Would appreciate some useful answers or tips, thank you! ;)
I don't think the best approach is to hack around with parsers. Parsers should turn one array of characters into one AST.
In your case, you've got a fairly complex but new language that uses multiple files. When you import B, you really want to import the AST. C++ historically relied on a literal #include, with all the parsing problems that brings, and is only now getting modules. Languages like Java did away with this textual inclusion, but retrofitted generics later on. You've got a clean slate. You should design your language such that the compiler can just take a bunch of ASTs as its input.
Since the compiler will take ASTs as input, each AST will be read-only. You can of course have a cache for instantiations so you don't need to re-instantiate printFormat<int> every time you encounter it in an AST, but that's a detail.
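To make the cache idea concrete, here is a minimal C++ sketch. Everything in it is hypothetical and not taken from your compiler: the names InstantiationCache, Instantiation and getOrCreate are made up, and concrete types are represented as plain strings for brevity. A real implementation would store the type-checked AST of the instantiation instead of just a name.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type-checked, instantiated function body.
struct Instantiation {
    std::string mangledName;  // e.g. "printFormat<int>"
};

// Cache key: name of the generic function plus its concrete type arguments.
using InstantiationKey = std::pair<std::string, std::vector<std::string>>;

class InstantiationCache {
public:
    // Returns the cached instantiation, creating it only on the first request.
    std::shared_ptr<const Instantiation> getOrCreate(
        const std::string& fn, const std::vector<std::string>& typeArgs) {
        InstantiationKey key{fn, typeArgs};
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;

        // In a real compiler, this is where the body would be type-checked
        // once for this particular combination of concrete types.
        ++created_;
        std::shared_ptr<const Instantiation> inst =
            std::make_shared<Instantiation>(Instantiation{mangle(fn, typeArgs)});
        cache_.emplace(std::move(key), inst);
        return inst;
    }

    // Number of instantiations that were actually created (cache misses).
    std::size_t instantiationCount() const { return created_; }

private:
    // Builds a readable name such as "printFormat<int,double>".
    static std::string mangle(const std::string& fn,
                              const std::vector<std::string>& typeArgs) {
        std::string name = fn + "<";
        for (std::size_t i = 0; i < typeArgs.size(); ++i) {
            if (i > 0) name += ",";
            name += typeArgs[i];
        }
        return name + ">";
    }

    std::map<InstantiationKey, std::shared_ptr<const Instantiation>> cache_;
    std::size_t created_ = 0;
};
```

With this shape, calls to printFormat<int> from any number of source files all resolve to the same entry, and the body is only checked once per distinct set of type arguments.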
What's not a detail is how instantiation should work in your language. A common mistake is the assumption that C++ templates work like macros, at the text level. That's not the case; they work at the language level. Yours should work at the language level too. It would be really convenient for you if instantiation took an AST (or at least a subtree thereof) and then produced a new AST for the instantiation, again read-only. It's no coincidence that the C++ template meta-language is effectively a functional language. These kinds of problems become much easier the more you can make read-only.
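As a rough illustration of "AST in, new AST out", here is a small C++ sketch. The Node layout, makeNode and instantiate are all assumptions made up for this example; your real AST (for instance one generated by a parser generator) will look different. The point is only that the generic function's tree is never modified: instantiation builds a fresh tree with the type parameters replaced.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
// All nodes are shared as pointers-to-const, so a finished tree is read-only.
using NodePtr = std::shared_ptr<const Node>;

// Hypothetical, heavily simplified AST node.
struct Node {
    std::string kind;  // e.g. "FunctionDef", "Param", "Call"
    std::string text;  // identifier or literal
    std::string type;  // declared type, possibly a generic parameter like "T"
    std::vector<NodePtr> children;
};

// Maps generic type parameters to concrete types, e.g. "T" -> "double".
using TypeBindings = std::map<std::string, std::string>;

// Convenience helper for building nodes (hypothetical).
NodePtr makeNode(std::string kind, std::string text, std::string type,
                 std::vector<NodePtr> children = {}) {
    auto node = std::make_shared<Node>();
    node->kind = std::move(kind);
    node->text = std::move(text);
    node->type = std::move(type);
    node->children = std::move(children);
    return node;
}

// Pure function: reads the generic tree and returns a new tree in which
// every bound type parameter is replaced. The input is left untouched.
NodePtr instantiate(const NodePtr& tmpl, const TypeBindings& bindings) {
    auto copy = std::make_shared<Node>();
    copy->kind = tmpl->kind;
    copy->text = tmpl->text;
    auto bound = bindings.find(tmpl->type);
    copy->type = (bound != bindings.end()) ? bound->second : tmpl->type;
    copy->children.reserve(tmpl->children.size());
    for (const NodePtr& child : tmpl->children) {
        copy->children.push_back(instantiate(child, bindings));
    }
    return copy;
}
```

The type checker can then visit each instantiated tree exactly once, like any other function, instead of re-visiting the generic body with different state each time. That also removes a whole class of bugs caused by leftover state from a previous visit.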