javacc

What does the lookahead do, and why does it take two tokens?


I have this:

void Identifier() : { Token t;}
{
t = <IDENTIFIER> {jjtThis.setValue(t.image);}
}

void AssignStatement() : { Token t;}
{
(
    LOOKAHEAD(2) Identifier() t = <ASSIGN>
    {
        jjtThis.addOp(t.image);
    }
)+ Expression()
}

Expression() calls comparison(), which in turn calls logicExpression(), and so on (ordered by operator priority). My tokens cover the assignment operator, arithmetic and logic operators, comparison operators, the semicolon, and the if, while, and for statements, etc. I know that LOOKAHEAD(2) looks at the next two tokens to decide which rule to choose, but I don't understand how that applies in my case.


Solution

  • When the parser needs to make a choice, it usually does so on the basis of the next token of input. In this case it needs to choose between staying in the loop and leaving the loop. But the usual one token is not enough: if the next token is <IDENTIFIER>, that could be the start of an Identifier or the start of an Expression. However, two tokens of lookahead are sufficient. If the next two tokens are <IDENTIFIER> <ASSIGN>, it should stay in the loop, since that can't be the start of an Expression. And while an Expression might be just an <IDENTIFIER>, an <ASSIGN> can't follow an AssignStatement.

    I.e., the generated code for AssignStatement looks something like this, ignoring the tree-building code:

    call Identifier
    if the next token is not an <ASSIGN> error
    read the next token, call it t
    while the next two tokens are <IDENTIFIER> <ASSIGN>
        call Identifier
        if the next token is not an <ASSIGN> error
        read the next token, call it t
    call Expression
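
The pseudocode above can be tried out as a small hand-written Java sketch. This is not the code JavaCC actually generates; it only imitates the same decision. Everything in it is an assumption made for illustration: the class name LookaheadSketch, the helpers peek and expect, representing tokens as plain kind strings, and a stand-in Expression of the form IDENTIFIER (PLUS IDENTIFIER)* in place of the real expression grammar.

```java
public class LookaheadSketch {
    private final String[] tokens; // token kinds, e.g. "IDENTIFIER", "ASSIGN"
    private int pos = 0;           // index of the next unread token

    public LookaheadSketch(String... tokens) {
        this.tokens = tokens;
    }

    // Kind of the token i positions ahead (1 = next token), or "EOF" past the end.
    private String peek(int i) {
        int idx = pos + i - 1;
        return idx < tokens.length ? tokens[idx] : "EOF";
    }

    // Consume the next token if it has the given kind, otherwise report an error.
    private void expect(String kind) {
        if (!peek(1).equals(kind)) {
            throw new IllegalStateException(
                "expected " + kind + " but found " + peek(1) + " at token " + pos);
        }
        pos++;
    }

    // Hypothetical stand-in for Expression(): IDENTIFIER (PLUS IDENTIFIER)*
    private void expression() {
        expect("IDENTIFIER");
        while (peek(1).equals("PLUS")) {
            expect("PLUS");
            expect("IDENTIFIER");
        }
    }

    // Imitates ( LOOKAHEAD(2) Identifier() <ASSIGN> )+ Expression().
    // Returns how many "Identifier <ASSIGN>" pairs were consumed.
    public int assignStatement() {
        int targets = 0;
        // The first pass through ( ... )+ is mandatory, so no choice is made here.
        expect("IDENTIFIER");
        expect("ASSIGN");
        targets++;
        // Staying in the loop requires BOTH of the next two tokens to match.
        // Checking only peek(1) would wrongly treat the IDENTIFIER that
        // starts the Expression as another assignment target.
        while (peek(1).equals("IDENTIFIER") && peek(2).equals("ASSIGN")) {
            expect("IDENTIFIER");
            expect("ASSIGN");
            targets++;
        }
        expression();
        return targets;
    }

    public static void main(String[] args) {
        // Token kinds for: a = b = c + d
        LookaheadSketch parser = new LookaheadSketch(
            "IDENTIFIER", "ASSIGN", "IDENTIFIER", "ASSIGN",
            "IDENTIFIER", "PLUS", "IDENTIFIER");
        System.out.println(parser.assignStatement()); // prints 2
    }
}
```

If the loop condition is reduced to peek(1).equals("IDENTIFIER"), the same input fails: the sketch tries to read c as a third assignment target and then reports that it expected ASSIGN but found PLUS. That is the ambiguity LOOKAHEAD(2) resolves in the grammar.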