
LOOKAHEADs for the JavaScript/ECMAScript array literal production

I am currently implementing a JavaScript/ECMAScript 5.1 parser with JavaCC and have problems with the ArrayLiteral production.

ArrayLiteral :
    [ Elision_opt ]
    [ ElementList ]
    [ ElementList , Elision_opt ]

ElementList :
    Elision_opt AssignmentExpression
    ElementList , Elision_opt AssignmentExpression

Elision :
    ,
    Elision ,

I have three questions, I'll ask them one by one.

This is the second one.

I have simplified this production to the following form:

    "[" ("," | AssignmentExpression ",") * AssignmentExpression ? "]"

Please see the first question on whether it is correct or not:

How to simplify JavaScript/ECMAScript array literal production?
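As a quick sanity check of the simplified form (not a proof), both shapes can be written as regular expressions over a toy token alphabet and compared on every short token string. This is only a sketch: the class name ArrayLiteralShapes, the method firstDisagreement and the convention that 'e' stands for one complete AssignmentExpression are assumptions made for illustration; none of them come from JavaCC or the specification.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
// Toy alphabet: '[' ',' ']' are themselves, 'e' is one whole AssignmentExpression.
class ArrayLiteralShapes {
    // The three ArrayLiteral alternatives from the grammar above,
    // with Elision = ,+ and ElementList = ,* e ( ,+ e )*
    static final Pattern SPEC =
        Pattern.compile("\\[(,*|,*e(,+e)*|,*e(,+e)*,+)\\]");

    // The simplified form: "[" ("," | e ",")* e? "]"
    static final Pattern SIMPLIFIED = Pattern.compile("\\[(,|e,)*e?\\]");

    // Returns the first token string with at most maxLen inner tokens on
    // which the two forms disagree, or null if they agree on all of them.
    static String firstDisagreement(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder("[");
                for (int i = 0; i < len; i++) {
                    sb.append(((bits >> i) & 1) == 0 ? ',' : 'e');
                }
                String s = sb.append(']').toString();
                if (SPEC.matcher(s).matches() != SIMPLIFIED.matcher(s).matches()) {
                    return s;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstDisagreement(12));
    }
}
```

Both forms accept exactly the token strings in which no two expressions are adjacent, so the check is expected to find no disagreement. It says nothing about how JavaCC resolves the choices, only about which token sequences are accepted.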

Now I have tried to implement it in JavaCC as follows:

void ArrayLiteral() : {} {
    "["
    (   ","
    |   AssignmentExpression() ","
    ) *
    (   AssignmentExpression()
    ) ?
    "]"
}

JavaCC complains about ambiguous , or AssignmentExpression (its contents). Obviously, a LOOKAHEAD specification is required. I have spent a lot of time trying to figure the LOOKAHEADs out, tried different things like

  • LOOKAHEAD (AssignmentExpression() ",") in (...)*
  • LOOKAHEAD (AssignmentExpression() "]") in (...)?

and a few other variations, but I could not get rid of the JavaCC warning.

I fail to understand why this does not work:

void ArrayLiteral() : {} {
    "["
    (   LOOKAHEAD ("," | AssignmentExpression() ",")
        ","
    |   AssignmentExpression() ","
    ) *
    (   LOOKAHEAD (AssignmentExpression() "]")
        AssignmentExpression()
    ) ?
    "]"
}

Ok, AssignmentExpression() per se is ambiguous, but the trailing "," or "]" in LOOKAHEADs should make it clear which of the choices should be taken - or am I mistaken here?

What would a correct LOOKAHEAD specification for this production look like?


This did not work, unfortunately:

void ArrayLiteral() : {} {
    "["
    (   LOOKAHEAD (AssignmentExpression() ",")
        AssignmentExpression() ","
    |   ","
    ) *
    (   AssignmentExpression()
    ) ?
    "]"
}


Warning: Choice conflict in (...)* construct at line 6, column 5.
         Expansion nested within construct and expansion following construct
         have common prefixes, one of which is: "function"
         Consider using a lookahead of 2 or more for nested expansion.

Line 6 of my grammar file is the ( before the first LOOKAHEAD. The common prefix "function" is simply one of the possible starts of AssignmentExpression.


  • Here is yet another approach. It has the advantage of identifying which commas indicate undefined elements without using any semantic actions.

    void ArrayLiteral() : {} { "[" MoreArrayLiteral() }
    void MoreArrayLiteral() : {} {
         "]"
    |    "," /* undefined item */ MoreArrayLiteral()
    |    AssignmentExpression() ( "]" |  "," MoreArrayLiteral() )
    }
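To see why this right-recursive shape needs no LOOKAHEAD, here is a hand-written recursive-descent sketch of the same idea in plain Java. It is not JavaCC output. The class ArrayLiteralSketch, the one-character token encoding ('e' for one complete AssignmentExpression) and the "hole"/"expr" labels are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one char per token, 'e' = one AssignmentExpression.
class ArrayLiteralSketch {
    private final String tokens;
    private int pos = 0;
    private final List<String> elements = new ArrayList<>();

    private ArrayLiteralSketch(String tokens) { this.tokens = tokens; }

    // Parses a token string such as "[,e,,e,]" and lists its elements.
    static List<String> parse(String tokens) {
        ArrayLiteralSketch p = new ArrayLiteralSketch(tokens);
        p.expect('[');
        p.moreArrayLiteral();
        if (p.pos != tokens.length()) {
            throw new IllegalArgumentException("trailing input at position " + p.pos);
        }
        return p.elements;
    }

    // Mirrors MoreArrayLiteral: every choice is decided by a single token.
    private void moreArrayLiteral() {
        char t = peek();
        if (t == ']') {                 // end of the literal
            pos++;
        } else if (t == ',') {          // a comma with nothing before it: undefined item
            pos++;
            elements.add("hole");
            moreArrayLiteral();
        } else {                        // an expression, then "]" or "," and the rest
            expect('e');
            elements.add("expr");
            if (peek() == ']') {
                pos++;
            } else {
                expect(',');
                moreArrayLiteral();
            }
        }
    }

    private char peek() {
        return pos < tokens.length() ? tokens.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("expected '" + c + "' at position " + pos);
        }
        pos++;
    }
}
```

For example, ArrayLiteralSketch.parse("[,e,,e,]") yields hole, expr, hole, expr: the comma directly after an expression only separates, while any other comma marks an undefined item, and the final comma before "]" adds nothing.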