I have the following grammar and want to parse input to get the associated ASTs. This is easy with ANTLR for Java. Since ANTLR 4, grammar files no longer need the option `output=AST;` to get AST information.
Hello.g
grammar Hello; // Define a grammar called Hello
stat : expr NEWLINE
| ID '=' expr NEWLINE
| NEWLINE
| expr
;
expr: atom (op atom)* ;
op : '+'|'-' ;
atom : INT | ID;
ID : [a-zA-Z]+ ;
INT : [0-9]+ ;
NEWLINE : '\r' ? '\n' ;
WS : [ \t\r\n]+ -> skip ;
Test.java
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
import java.io.*;
import lib.HelloLexer;
import lib.HelloParser;
public class Test {
public static void main(String[] args) throws Exception {
ANTLRInputStream input = new ANTLRInputStream("5 + 3");
// create a lexer that feeds off of input CharStream
HelloLexer lexer = new HelloLexer(input);
// create a buffer of tokens pulled from the lexer
CommonTokenStream tokens = new CommonTokenStream(lexer);
// create a parser that feeds off the tokens buffer
HelloParser parser = new HelloParser(tokens);
ParseTree tree = parser.expr(); // begin parsing at the expr rule
// print LISP-style tree
System.out.println(tree.toStringTree(parser));
}
}
The output will be:
(expr (atom 5) (op +) (atom 3))
How can I obtain the same result with the Python implementation? I'm currently using the ANTLR 3.1.3 runtime for Python. The following code only returns "(+ 5 3)":
Test.py
import sys
import antlr3
import antlr3.tree
from antlr3.tree import Tree
from HelloLexer import *
from HelloParser import *
char_stream = antlr3.ANTLRStringStream('5 + 3')
lexer = HelloLexer(char_stream)
tokens = antlr3.CommonTokenStream(lexer)
parser = HelloParser(tokens)
r = parser.stat()
print r.tree.toStringTree()
There is now an ANTLR 4 runtime for Python (https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Python+Target), but in the Python runtimes toStringTree is a class method of the Trees class. You can call it like this to get the LISP-style parse tree, including stringified tokens:
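from antlr4 import *
from antlr4.tree.Trees import Trees
# import your generated lexer and parser here, e.g. for the Hello grammar above
from HelloLexer import HelloLexer
from HelloParser import HelloParser
# set up the stream, lexer, parser and tree as usual
lexer = HelloLexer(InputStream("5 + 3"))
parser = HelloParser(CommonTokenStream(lexer))
tree = parser.expr()
# None is the optional list of rule names; they are taken from the parser instead
print(Trees.toStringTree(tree, None, parser))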
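To make the LISP-style format concrete, here is a minimal pure-Python sketch of how such a string can be built from a tree. It is not the ANTLR runtime's implementation: the nested-tuple representation and the `to_string_tree` helper are assumptions made up for illustration, with each node written as `(rule_name, child, ...)` and each token as a plain string.

```python
def to_string_tree(node):
    """Render a nested (label, child, ...) tuple as a LISP-style string.

    Hypothetical helper for illustration only; the real ANTLR runtime
    walks ParseTree objects rather than tuples.
    """
    # A plain string is a leaf (a token's text): print it as-is.
    if isinstance(node, str):
        return node
    label, *children = node
    # A rule node without children is printed as its bare name.
    if not children:
        return label
    # Otherwise: "(label child1 child2 ...)", recursing into each child.
    parts = [label] + [to_string_tree(child) for child in children]
    return "(" + " ".join(parts) + ")"


# Hand-built tree mirroring the parse of "5 + 3" with the expr rule.
tree = ("expr", ("atom", "5"), ("op", "+"), ("atom", "3"))
print(to_string_tree(tree))  # (expr (atom 5) (op +) (atom 3))
```

Each rule node opens a parenthesis, prints its name, then its children in order, which is why the ANTLR 4 output keeps the `atom` and `op` rule names that the ANTLR 3 AST output "(+ 5 3)" drops.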