I heard that most compilers build an AST and then translate it to IR (intermediate representation). But I think a compiler can generate IR directly, as the C4 project does.
If I use an AST, then after syntax analysis and semantic analysis I have to walk the whole AST again to generate IR. This is an extra step, so I think it is slow.
What are the benefits of using an AST? Better readability or better portability?
Could you give me some advice? Thank you for your time.
You'll likely need more than one AST. Your first AST, the one produced by the parser, is likely full of redundancy: all the syntactic sugar that makes your source language easy to use. Before you start generating IR you need to remove this redundancy, otherwise your code generation step will turn into repetitive boilerplate.
A case in point: the `if` statement. You have two forms of it, one with only a `true` branch and another with both `true` and `false` branches. The former is a special case of the latter, so it makes sense to do a pass over your AST that replaces all the one-armed `if` statements with two-armed ones that have a dummy `false` branch. Then your IR generation procedure only has to deal with one kind of `if`.
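A minimal sketch of such a desugaring pass, in Python. The node classes (`Num`, `Block`, `If`) and the `desugar` function are hypothetical names made up for this illustration; a real front end would have many more node types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Num:
    value: int


@dataclass
class Block:
    stmts: list


@dataclass
class If:
    cond: object
    then: object
    orelse: Optional[object] = None  # None marks the one-armed form


def desugar(node):
    """Return a new tree in which every If has both branches."""
    if isinstance(node, If):
        # A missing false branch becomes an empty (dummy) block.
        orelse = Block([]) if node.orelse is None else desugar(node.orelse)
        return If(desugar(node.cond), desugar(node.then), orelse)
    if isinstance(node, Block):
        return Block([desugar(s) for s in node.stmts])
    return node  # leaves such as Num are returned unchanged


# One-armed if: "if (1) { 2 }"
tree = Block([If(Num(1), Block([Num(2)]))])
lowered = desugar(tree)
```

After this pass, later stages can assume `orelse` is never `None`, so they need no special case for the one-armed form.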
Another important consideration is typing. For the vast majority of type systems out there, type checking and inference are easier to do on a tree than on a flat IR.
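To illustrate why a tree is convenient here, this is a sketch of a tiny type checker written as one recursive function, assuming a toy expression language with only `int` and `bool`. All class and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class IntLit:
    value: int


@dataclass
class BoolLit:
    value: bool


@dataclass
class Add:
    left: object
    right: object


@dataclass
class IfExpr:
    cond: object
    then: object
    orelse: object


def type_of(node):
    """Compute the type of a node from the types of its children."""
    if isinstance(node, IntLit):
        return "int"
    if isinstance(node, BoolLit):
        return "bool"
    if isinstance(node, Add):
        if type_of(node.left) == "int" and type_of(node.right) == "int":
            return "int"
        raise TypeError("operands of + must be int")
    if isinstance(node, IfExpr):
        if type_of(node.cond) != "bool":
            raise TypeError("condition must be bool")
        then_type = type_of(node.then)
        if then_type != type_of(node.orelse):
            raise TypeError("both branches must have the same type")
        return then_type
    raise TypeError(f"unknown node: {node!r}")


expr = IfExpr(BoolLit(True), IntLit(1), Add(IntLit(2), IntLit(3)))
result = type_of(expr)
```

Each rule only looks at a node and its direct children, which mirrors how typing rules are usually written down. On a flat instruction list the same information would have to be reconstructed from jumps and temporaries.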
Also, consider your flat IR to be just another form of AST and treat it the same way. Converting a complex AST into a low-level, simple backend AST (or IR, call it whatever you like) in small steps is much easier than doing everything in one huge boilerplate pass.
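As a rough sketch of one such small step, the pass below lowers an already desugared AST (every `If` has both branches) into a flat list of stack-machine instructions. The node classes, the instruction names (`push`, `print`, `jump_if_zero`, `jump`, `label`) and the `lower` function are all assumptions made up for this example, not any particular compiler's IR.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Const:
    value: int


@dataclass
class Print:
    expr: Const


@dataclass
class If:
    cond: Const
    then: list    # list of statements
    orelse: list  # always present after desugaring, possibly empty


def lower(stmts, out, labels):
    """Append flat IR instructions for stmts to out and return out."""
    for s in stmts:
        if isinstance(s, Print):
            out.append(("push", s.expr.value))
            out.append(("print",))
        elif isinstance(s, If):
            n = next(labels)  # fresh number so labels are unique
            else_label, end_label = f"else{n}", f"end{n}"
            out.append(("push", s.cond.value))
            out.append(("jump_if_zero", else_label))
            lower(s.then, out, labels)
            out.append(("jump", end_label))
            out.append(("label", else_label))
            lower(s.orelse, out, labels)
            out.append(("label", end_label))
        else:
            raise ValueError(f"unknown statement: {s!r}")
    return out


program = [If(Const(1), [Print(Const(10))], [])]
ir = lower(program, [], count())
```

Because the earlier pass guaranteed two-armed `If` nodes, this step has exactly one case for `If` and stays short.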