Search code examples
javaparsingcompiler-constructionabstract-syntax-treejavacc

How Do I Build an Abstract Syntax Tree with JJTree?


When building an AST and adding children to the tree, what is the difference between:

void NonTerminal #Nonterminal: { Token t;}
{
    t = <MULTIPLY> OtherNonTerminal() {jjtThis.value = t.image;} #Multiply
}

and:

void NonTerminal : { Token t;}
{
    t = <MULTIPLY> OtherNonTerminal() {jjtThis.value = t.image;} #Multiply(2)
}

Note:

<MULTIPLY : "*">

Are there any major differences and will both of these work the same way?

Also would another way of building the tree for this production rule:

void NonTerminal() : { Token t; }
{
    t = <MULTIPLY> OtherNonTerminal() { jjtThis.value = t.image; } #Mult(2)
|   t = <DIVIDE> OtherNonTerminal() { jjtThis.value = t.image; } #Div(2)
|   {}
}

be like this:

void NonTerminal() #Nonterminal(2) : { Token t; }
{
    (t = <MULTIPLY> OtherNonTerminal() | t = <DIVIDE> OtherNonTerminal() | {}) {jjtThis.value = t.image;}
}

Solution

  • In the first case

    void NonTerminal #Nonterminal: { Token t;}
    {
        t = <MULTIPLY>
        OtherNonTerminal() {jjtThis.value = t.image;}
        #Multiply
    }
    

    the Multiply node will have as children all the nodes pushed on the stack during its node scope, excluding any that are popped before the end of the scope. In this case that means all the nodes pushed and not popped during the parsing of OtherNonTerminal.

    In the second example

    void NonTerminal #void : { Token t;}
    {
        t = <MULTIPLY>
        OtherNonTerminal() {jjtThis.value = t.image;} 
        #Multiply(2)
    }
    

    the Multiply node will get the two top nodes from the stack as its children.

    So probably there is a difference.

    The other difference is that the second example doesn't specify a node associated with Nonterminal.

    In the first case, this tree will be pushed

            Nonterminal
                 |
              Multiply
                  |
    All nodes pushed (but not popped) during the parsing of OtherNonterminal
    

    In the second case, the parsing of OtherNonterminal will do its thing (popping and pushing nodes), then two nodes will the popped and this tree will be pushed

         Multiply
          |     |
      A child  Another child
    

    For the second question. The difference between

    void NonTerminal() #void : { Token t; }
    {
        t = <MULTIPLY>
        OtherNonTerminal()
        { jjtThis.value = t.image; }
        #Mult(2)
    |
        t = <DIVIDE>
        OtherNonTerminal()
        { jjtThis.value = t.image; }
        #Div(2)
    |
        {}
    }
    

    and

    void NonTerminal() #Nonterminal(2) : {
        Token t; }
    {
        ( t = <MULTIPLY> OtherNonTerminal()
        | t = <DIVIDE> OtherNonTerminal()
        | {}
        )
        {jjtThis.value = t.image;}
    }
    

    is that the first does not build a node when the empty sequence is matched.

    Consider the second way in the the case where the next token is something other than * or /. You'll will get

          Nonterminal
          /        \
      Some node    Some other node
      don't want   you don't want
    

    I'm actually surprised that the second one even gets past the Java compiler, since the reference to t is a potentially uninitialized variable.