java antlr abstract-syntax-tree code-translation

ANTLR: How to get the rewritten source code? (using TokenRewriteStream)


I am trying to create a simple translator that turns something like:

aaa | bbb | ccc

to

1 : aaa
2 : bbb
3 : ccc

Here is grammar test01.g:

grammar test01;

options {
    output=AST; 
}

@members{
  int N;
}

test 
@init{
  N = 0;
}:
  id ('|' id)* -> id (BR id)*;

id   : {N++;} ID  -> {new CommonTree(new CommonToken(ID, Integer.toString(N) + " : "  + $ID.text))};
ID   : ('a'..'z')+;
BR   : '\n';
WS   : ' '{$channel=HIDDEN;};

Translator source FooTest.java:

import org.antlr.runtime.*;

class FooTest {
  public static void main(String[] args) throws Exception {    
    String text = "aaa | bbb | ccc";        
    System.out.println("parsing: "+text);        
    ANTLRStringStream in = new ANTLRStringStream(text);
    test01Lexer lexer = new test01Lexer(in);
    CommonTokenStream tokens = new TokenRewriteStream(lexer);
    test01Parser parser = new test01Parser(tokens);
    parser.test();
    System.out.println("Result: "+tokens.toString());    
  }
}

When I run it, I expect to get something like:

parsing: aaa | bbb | ccc
Result:
 1 : aaa
 2 : bbb 
 3 : ccc

But I get:

  parsing: aaa | bbb | ccc
  Result:  aaa | bbb | ccc

The text seems to be unmodified.

How do I get the modified source?


Solution

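  • You're simply printing the flat list of tokens. The AST rewrite rules (->) only build a tree; they never touch the token stream, and a TokenRewriteStream returns the original text unless rewrite operations such as insertBefore or replace were recorded on it. So this prints the input unchanged: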

    CommonTokenStream tokens = new TokenRewriteStream(lexer);
    // ...
    System.out.println("Result: "+tokens.toString());  
    

    If you adjust your FooTest class to:

    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    
    class FooTest {
      public static void main(String[] args) throws Exception {    
        String text = "aaa | bbb | ccc";        
        System.out.println("parsing: "+text);        
        ANTLRStringStream in = new ANTLRStringStream(text);
        test01Lexer lexer = new test01Lexer(in);
        CommonTokenStream tokens = new TokenRewriteStream(lexer);
        test01Parser parser = new test01Parser(tokens);
        CommonTree root = (CommonTree)parser.test().getTree();
        for(int i = 0; i < root.getChildCount(); i++) {
          CommonTree child = (CommonTree)root.getChild(i);
          System.out.println("root.children[" + i + "] = " + child);
        }
      }
    }
    

    the following is printed to the console:

    parsing: aaa | bbb | ccc
    root.children[0] = 1 : aaa
    root.children[1] = BR
    root.children[2] = 2 : bbb
    root.children[3] = BR
    root.children[4] = 3 : ccc
    

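    Also note that you don't need a member variable in your parser class. Rules can declare their own local variables in an @init block. Declare the counter in test rather than in id: a variable declared in id's @init block would be reset to 0 on every invocation of id, so every item would be numbered 1. Pass the counter to id as a rule parameter instead:

    grammar test01;
    
    options {
        output=AST; 
    }
    
    test
    @init{
      int n = 0; // local to this rule, lives for one invocation of test
    }
      : id[++n] ('|' id[++n])* -> id (BR id)*
      ;
    
    id[int n]
      : ID  -> {new CommonTree(new CommonToken(ID, Integer.toString($n) + " : "  + $ID.text))}
      ;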
    
    // other rules
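
The child nodes printed by the adjusted FooTest above still have to be joined into the translated text. The following is a minimal sketch of that last step in plain Java, without the ANTLR runtime, so it can be run on its own. TreeTextRenderer and render are hypothetical names, and the sketch assumes the child texts were collected beforehand (for example with root.getChild(i).getText()) and that the imaginary BR node carries the text "BR", as in the console output shown earlier.

```java
import java.util.Arrays;
import java.util.List;

final class TreeTextRenderer {

    // Hypothetical helper: concatenates the texts of the root's children.
    // A child whose text is exactly "BR" is emitted as a line break.
    static String render(List<String> childTexts) {
        StringBuilder out = new StringBuilder();
        for (String text : childTexts) {
            out.append("BR".equals(text) ? "\n" : text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Child texts as printed by the adjusted FooTest (assumed input).
        List<String> children =
            Arrays.asList("1 : aaa", "BR", "2 : bbb", "BR", "3 : ccc");
        System.out.println(render(children));
    }
}
```

Running main prints the three numbered lines from the question. Against the real tree, comparing each child's token type with the generated BR constant would probably be more robust than matching on the text "BR"; the string comparison is used here only to keep the sketch free of ANTLR dependencies.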