I have the following AST:
import org.antlr.v4.runtime.CommonToken;
import org.antlr.v4.runtime.Token;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
public class AST {
Token token; // from which token do we create this node?
List<AST> children; // normalized list of children
public AST() { ; } // for making nil-rooted nodes
public AST(Token token) { this.token = token; }
/** Create node from token type; used mainly for imaginary tokens */
public AST(int tokenType) { this.token = new CommonToken(tokenType); }
/** external visitors execute the same action for all nodes
* with same node type while walking
*/
public int getNodeType() { return token.getType(); }
public void addChild(AST t) {
if (children == null) children = new ArrayList<>();
children.add(t);
}
public void addChildren(AST... nodes) {
List<AST> asts = new ArrayList<AST>();
for (AST node : nodes){asts.add(node);}
for (int i = 0; i < asts.size(); i++) {
this.addChild(asts.get(i));
}
}
public List<AST> getChildren() { return children; }
/** to represent flat lists. A list is a subtree w/o a root, which we simulate
* with a nil root node. A nil node is a node with token == null.
*/
public boolean isNil() { return token == null; }
/** Compute string for single node */
public String toString() {
String typeName = CLUBParser.VOCABULARY.getSymbolicName(getNodeType());
typeName = typeName == null ? token.getText() : typeName;
return token != null ? "<" +typeName +", '" + token.getText() +"'>": "nil";
}
/** Compute string for a whole tree */
public String toStringTree() {
if (children == null || children.isEmpty()) return this.toString();
StringBuilder buf = new StringBuilder();
if (!isNil()) {
buf.append('(');
buf.append(this.toString());
buf.append(' ');
}
for (int i = 0; i < children.size(); i++) {
AST t = children.get(i); // normalized (unnamed) children
if (t != null) { // Null check
if (i > 0) buf.append(' ');
buf.append(t.toStringTree());
}
}
if (!isNil()) buf.append(')');
return buf.toString();
}
}
This Grammar:
grammar CLUB;
program : setup round turn funcs EOF ;
setup : SETUP LCURLY s=stmts RCURLY ;
round : ROUND LCURLY s=stmts RCURLY ;
turn : TURN '(' CLASSID VARID ')' '{' s=stmts '}'
| ;
funcs : func+
| ;
stmts : s1=stmt s2=stmts
| ;
func : FUNCID '(' tParams ')' '{' stmts '}'
| FUNCID '(' ')' '{' stmts '}' ;
tParams : tParam ',' tParams;
stmt : INTVAL ';' ;
iterStmt : 'while' '(' expr ')' '{' stmts '}'
| 'for' '(' decl ';' expr ';' postfExpr ')' '{' stmts '}' ;
selectStmt : 'if' '(' expr ')' '{' stmts '}' ;
exprStmt : expr ;
type : 'bool'
| 'int'
| 'string' ;
tParam : type VARID ;
params : param ',' params
|'['list']' ',' param
| param ;
list : OBJID ',' list
| OBJID ;
param : assignExpr ;
expr : assignExpr ;
assignExpr : logicOrExpr
| primaryExpr '=' assignExpr ;
logicOrExpr : logicAndExpr
| logicOrExpr '||' logicAndExpr ;
logicAndExpr : equalExpr
| logicAndExpr '&&' equalExpr ;
equalExpr : relatExpr
| equalExpr '==' relatExpr
| equalExpr '!=' relatExpr ;
relatExpr : addExpr
| relatExpr '<' addExpr
| relatExpr '>' addExpr
| relatExpr '<=' addExpr
| relatExpr '>=' addExpr ;
addExpr : multExpr
| addExpr '+' multExpr
| addExpr '-' multExpr ;
multExpr : unaryExpr
| multExpr '*' unaryExpr
| multExpr '/' unaryExpr ;
unaryExpr : postfExpr
| '-' postfExpr
| '!' postfExpr ;
postfExpr : primaryExpr
| primaryExpr '++'
| primaryExpr '--'
| primaryExpr '(' params ')'
| primaryExpr '(' ')'
| METHODID '(' params ')'
| METHODID '(' ')'
| primaryExpr '.' postfExpr ;
primaryExpr : val
| VARID
| OBJID
| FUNCID
| CLASSID ;
val : BOOLVAL
| INTVAL
| STRINGVAL ;
decl : tParam '=' logicOrExpr ;
AND : '&&' ;
OR : '||' ;
EQ : '==' ;
NOTEQ : '!=' ;
SMALLER : '<' ;
LARGER : '>' ;
SMALLEREQ : '<=' ;
BIGGEREQ : '>=' ;
INCREM : '++' ;
DECREM : '--' ;
PLUS : '+' ;
MINUS : '-' ;
MULT : '*' ;
DIVIDE : '/' ;
DOT : '.' ;
NOT : '!' ;
ASSIGN : '=' ;
COMMA : ',' ;
SEMI : ';' ;
LPAREN : '(' ;
RPAREN : ')' ;
LCURLY : '{' ;
RCURLY : '}' ;
LSQUARE : '[' ;
RSQUARE : ']' ;
WS: [ \t\n\r\f]+ -> skip ;
SETUP : 'Setup' ;
ROUND : 'Round' ;
TURN : 'Turn' ;
WHILE : 'while' ;
FOR : 'for' ;
IF : 'if' ;
BOOL : 'bool' ;
INT : 'int' ;
STRING : 'string' ;
METHODID : 'create'
|'getHand'
|'getHandPoints'
|'handSize'
|'printHand'
|'getTable'
|'getTablePoints'
|'printTable'
|'getDiscardPile'
|'draw'
|'discard'
|'takeCard'
|'layDown'
|'addJokers'
|'assignPoints'
|'shuffle'
|'getCard'
|'getCards'
|'print'
|'size'
|'returnCards'
|'returnDiscardPile'
|'flip'
|'getPoints' ;
BOOLVAL : 'true'
| 'false' ;
CLASSID : 'Player'
| 'Deck'
| 'Card'
| 'Table' ;
OBJID : [HDCS][2-9]
|[HDCS]'1'[0-3]?
|'J'[1-3]
|'player'[1-9][0-9]*
|'deck'
|'table' ;
INTVAL : '0'
| [1-9][0-9]* ;
VARID : [a-z][a-zA-Z_0-9]* ;
FUNCID : [A-Z][a-zA-Z_0-9]* ;
STRINGVAL : '"'~["]*'"' ;
And the following overwritten methods that work fine:
public class ClubBuildASTVisitor extends CLUBBaseVisitor<AST> {
// Start of tree
@Override
public AST visitProgram(CLUBParser.ProgramContext ctx) {
AST ast = new AST();
ast.addChild(visit(ctx.setup()));
ast.addChild(visit(ctx.round()));
ast.addChild(visit(ctx.turn()));
ast.addChild(visit(ctx.funcs()));
return ast;
}
@Override
public AST visitSetup(CLUBParser.SetupContext ctx) {
AST ast = new AST(ctx.SETUP().getSymbol());
ast.addChild(visit(ctx.stmts()));
return ast;
}
@Override
public AST visitRound(CLUBParser.RoundContext ctx) {
AST ast = new AST(ctx.ROUND().getSymbol());
AST stmtsAst = visit(ctx.stmts());
ast.addChild(stmtsAst);
return ast;
}
Im trying to create one for visitStmts. The problem is that the when I make one it keeps choosing the first child in the tree to become the root. An example would be:
@Override
public AST visitStmts(CLUBParser.StmtsContext ctx) {
AST ast = new AST(ctx.getStart());
int childCount = ctx.getChildCount();
for (int i = 0; i < childCount; i++) {
ParseTree child = ctx.getChild(i);
AST childAst = visit(child);
ast.addChild(childAst);
}
return ast;
}
How can i choose Round as my root and make the rest children?
ive tried replacing the parameters of the
@Override
public AST visitStmts(CLUBParser.StmtsContext ctx) {
AST ast = new AST(new CommonToken(CLUBParser.ROUND)); // Create AST with "ROUND" as the root token
if (ctx.stmt() != null) {
AST stmtChild = visit(ctx.stmt());
ast.addChild(stmtChild);
}
if (ctx.stmts() != null) {
AST stmtsChild = visit(ctx.stmts());
if (stmtsChild != null && !stmtsChild.isNil()) {
for (AST child : stmtsChild.getChildren()) {
ast.addChild(child);
}
}
}
return ast;
}
But that ends up with the output looking like this:
(<SETUP, 'Setup'> (<ROUND, 'null'> )) (<ROUND, 'Round'> (<ROUND, 'null'> ))
I need the AST for type checking but ive read its also possible to do so with the parse tree built into antlr4?
This construct:
stmts : s1=stmt s2=stmts
| ;
stmt : INTVAL ';' ;
Will create a parse tree that is n nodes deep where n is the number of stmt
s, and each level will have its own stmts
node. This will make it a bit cumbersome to navigate to the "parent" of the top level stmts
node.
This is not an uncommon structure in parser generators less expressive than ANTLR. However, in ANTLR you can get the same ability to parse "one or more stmt
s" as a stmts
rule, like so:
stmts: stmt+;
If you take a look at the Ctx class generated for this rule, you'll find that all of your stmt
s will be in a list in the stmts
Ctx. You won't have to trace up an extended tree to find the root stmts
's parent.
You don't provide sample input so it's not easy to provide images of the difference between the parse trees, but you can use the grun
alias, an ANTLR plugin for your IDE or Antlr Lab to see the difference in the two parse trees. Take a look at the parse tree with and without the changes, and it should be obvious that using +
to build the list produces a parse tree that's much more manageable.
This should make it much easier to gather all of the stmt
nodes that are the children of your stmts
node (and you'll only have a single stmts
node that just contains a list of the stmt
children).
Note: You may even find, that it simplifies things enough that you can change:
setup : SETUP LCURLY s=stmts RCURLY ;
round : ROUND LCURLY s=stmts RCURLY ;
to:
setup : SETUP LCURLY s=stmt+ RCURLY ;
round : ROUND LCURLY s=stmt+ RCURLY ;
and drop the stmts
rule entirely.