Let me put the question first: Can I convert a parse tree implementing this particular grammar to an AST trivially.
I was given this grammar to build a parse tree:
literal := INTEGER | FLOAT | TRUE | FALSE .
designator := IDENTIFIER { "[" expression0 "]" } .
op0 := ">=" | "<=" | "!=" | "==" | ">" | "<" .
op1 := "+" | "-" | "or" .
op2 := "*" | "/" | "and" .
expression0 := expression1 [ op0 expression1 ] .
expression1 := expression2 { op1 expression2 } .
expression2 := expression3 { op2 expression3 } .
expression3 := "not" expression3
| "(" expression0 ")"
| designator
| call-expression
| literal .
For this particular example:
func main() : void {
let a = 1 + 2 + 3 + 4;
}
My parser will generate (part of) the parse tree as such
EXPRESSION1
EXPRESSION2
EXPRESSION3
LITERAL
INTEGER(1)(lineNum:2, charPos:10)
OP1
ADD(lineNum:2, charPos:12)
EXPRESSION2
EXPRESSION3
LITERAL
INTEGER(2)(lineNum:2, charPos:14)
OP1
ADD(lineNum:2, charPos:16)
EXPRESSION2
EXPRESSION3
LITERAL
INTEGER(3)(lineNum:2, charPos:18)
OP1
ADD(lineNum:2, charPos:20)
EXPRESSION2
EXPRESSION3
LITERAL
INTEGER(4)(lineNum:2, charPos:22)
Just notice how these tree branches under EXPRESSION1 go:
EXPRESSION2 + EXPRESSION2 + EXPRESSION2 + EXPRESSION2
which the operator + doesn't correspond to its two operands. So it seems to me that, in the AST conversion, I can't get an AST that aids 3-address IR code generation by simply pulling up the operator to replace the non-terminal EXPRESSION1.
To achieve this goal, the grammar I would have written for this language will be like this instead
expression1 := expression2 | expression1 + expression2 (1)
expression2 := expression3 | expression2 * expression3 (2)
expression3 := literal (3)
which the branches under EXPRESSION1 are only
EXPRESSION1 + EXPRESSION2
However, this grammar is not LL(1) since |FIRST(expression2)| = |{literal, +}| > 1.
It begs the question that (1) what would be the most elegant and trivial way to convert this parse tree? (2) is my construction of the parse tree a complete waste of time for this grammar that I should have started out coding an AST instead?
I believe you wish to produce an AST like:
ADD
/ \
1 ADD
/ \
2 ADD
/ \
3 4
But probably you noticed that your parse tree is actually a flat list not representing an easy starting point to group operators and their operands in single pass. Anyway coding a more advanced parser, recognizing tree structure and applying grammar reductions is not a trivial task.
Hence before starting doing that you might wish to consider some existing parser generators like yacc or ANTLR. Probably you will need to rewrite your grammar to create rules centered on the operators instead of treating them as a recursion utility. A grammar not being a classic LL(1) might not be a big problem as modern generators (like ANTLR) can handle LL(k) grammars with bigger lookahead, predicates etc. and just bypass issues of that type.
If you still insist on coding it manually think about using a stack to store symbols and converting them to an AST node once an expression is collected but again it is not a simple task.