
How to reproduce the original code using an AST from JJTree


I have an assignment in which I must use a JavaCC parser to build a compiler. I have a grammar for a made-up language, which we will call K. Given an input program, I have to read it in, create an AST, and then traverse that AST to reproduce the original program in a cleaned-up form.

For example, given the code:

begin a := 2
s := 0 while - a
12 begin
    s := + s * a a a := + a 2
end end

Once I feed this into my program, I get an AST composed of nodes like Op, Const, ID, etc.

But I need to be able to get the actual numbers and variables used in the code, so that I can reproduce the code like so:

begin
  a := 2
  s := 0
  while - a 12
    begin
      s := + s * a a
      a := + a 2
    end
end

I have read through the example here, which shows how to make an AST, and as far as I can tell I have this working. What I am confused about is how to get the actual text that generated the nodes back out of the AST. The person in this question used the dump method, but that only gives you the type of each node. I just need an idea of how to get the actual identifiers from the nodes as I iterate through them.

I would really appreciate some advice here.


Solution

  • In the SimpleNode class there is a field called value, and you can use it for whatever you want. In Bart Kiers's answer to this question you can see that he used the value field to store information like identifier names and constants.

    E.g., he writes

    void id() #ID :
    {Token t;}
    {
      t=<ID> {jjtThis.value = t.image;}
    }
    

    which means that any node whose id is the ID node type (JJTree names the generated constant JJTID) will contain the identifier (as a String) in its value field.

    To traverse the tree to reproduce the input, you can either write a single big recursive method to traverse the tree, or you can use visitors.

    Here is what the single big method might look like:

    static void buildString( SimpleNode n, String indentation, StringBuilder out ) {
        switch( n.id ) {
            ...
            // ID stands for the node-type constant (JJTree generates it as JJTID)
            case ID: out.append( n.jjtGetValue() ) ; break ;
            ...
        }
    }
    

    Since the id field is protected, the above won't compile when buildString lives in a class outside SimpleNode's package. You could do one of the following:

    • Change id to public. (Better yet add a public accessor and use that.)
    • Put the buildString method in the SimpleNode class.
    • Subclass SimpleNode with your own class that has an accessor for id and ensure that the parser uses that class by using the NODE_CLASS option. Also change the type of buildString's parameter.
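
To make the recursive approach concrete, here is a minimal stand-alone sketch. It does not use the JJTree-generated classes: Node, Kind, KPrettyPrinter, expression and sample are hypothetical names invented for illustration, and the tree shape (ASSIGN holding the target name in its value, WHILE having a condition child and a body child, OP holding the operator symbol) is an assumption, since the grammar for K is not shown. The node exposes its kind through an accessor, in the spirit of the accessor and NODE_CLASS options listed above.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-alone sketch: pretty-prints a tiny K-like AST. Not generated by JJTree. */
final class KPrettyPrinter {

    /** Hypothetical node kinds; a real JJTree parser would use its int node-type constants. */
    enum Kind { BLOCK, ASSIGN, WHILE, OP, ID, CONST }

    /** Stand-in for SimpleNode: a kind (like id), a value, and children. */
    static final class Node {
        private final Kind kind;
        private final Object value;          // mirrors SimpleNode's value field
        private final List<Node> children;

        Node(Kind kind, Object value, Node... children) {
            this.kind = kind;
            this.value = value;
            this.children = Arrays.asList(children);
        }

        Kind getKind() { return kind; }      // accessor instead of exposing the field
        Object getValue() { return value; }
        Node child(int i) { return children.get(i); }
        List<Node> children() { return children; }
    }

    /** Appends statement n to out, one statement per line, prefixed by indentation. */
    static void buildString(Node n, String indentation, StringBuilder out) {
        switch (n.getKind()) {
            case BLOCK:
                out.append(indentation).append("begin\n");
                for (Node stmt : n.children()) {
                    buildString(stmt, indentation + "  ", out);
                }
                out.append(indentation).append("end\n");
                break;
            case ASSIGN:   // assumed shape: value = target name, child 0 = right-hand side
                out.append(indentation).append(n.getValue()).append(" := ")
                   .append(expression(n.child(0))).append('\n');
                break;
            case WHILE:    // assumed shape: child 0 = condition, child 1 = body
                out.append(indentation).append("while ")
                   .append(expression(n.child(0))).append('\n');
                buildString(n.child(1), indentation + "  ", out);
                break;
            default:
                throw new IllegalArgumentException("not a statement: " + n.getKind());
        }
    }

    /** Renders a prefix expression on a single line. */
    static String expression(Node n) {
        switch (n.getKind()) {
            case ID:
            case CONST:
                return String.valueOf(n.getValue());
            case OP:       // assumed shape: value = operator symbol, two operand children
                return n.getValue() + " " + expression(n.child(0))
                        + " " + expression(n.child(1));
            default:
                throw new IllegalArgumentException("not an expression: " + n.getKind());
        }
    }

    /** Builds, by hand, an AST for the example program from the question. */
    static Node sample() {
        Node a = new Node(Kind.ID, "a");
        Node s = new Node(Kind.ID, "s");
        Node inner = new Node(Kind.BLOCK, null,
            new Node(Kind.ASSIGN, "s",
                new Node(Kind.OP, "+", s, new Node(Kind.OP, "*", a, a))),
            new Node(Kind.ASSIGN, "a",
                new Node(Kind.OP, "+", a, new Node(Kind.CONST, "2"))));
        return new Node(Kind.BLOCK, null,
            new Node(Kind.ASSIGN, "a", new Node(Kind.CONST, "2")),
            new Node(Kind.ASSIGN, "s", new Node(Kind.CONST, "0")),
            new Node(Kind.WHILE, null,
                new Node(Kind.OP, "-", a, new Node(Kind.CONST, "12")), inner));
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        buildString(sample(), "", out);
        System.out.print(out);   // prints the cleaned-up program from the question
    }
}
```

With a real JJTree parser, the same structure should carry over: switch on the node's id instead of getKind(), read text with jjtGetValue(), and walk children with jjtGetNumChildren() and jjtGetChild(i). Which nodes carry a value depends on where your grammar assigns jjtThis.value.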