I am currently implementing a JavaScript/ECMAScript 5.1 parser with JavaCC and have a problem with the ArrayLiteral production:
ArrayLiteral :
    [ Elision_opt ]
    [ ElementList ]
    [ ElementList , Elision_opt ]
ElementList :
    Elision_opt AssignmentExpression
    ElementList , Elision_opt AssignmentExpression
Elision :
    ,
    Elision ,
I have three questions, which I'll ask one by one. This is the second one.
I have simplified this production to the following form:
ArrayLiteral:
"[" ("," | AssignmentExpression ",") * AssignmentExpression ? "]"
Please see the first question on whether it is correct or not:
How to simplify JavaScript/ECMAScript array literal production?
Now I have tried to implement it in JavaCC as follows:
void ArrayLiteral() :
{
}
{
    "["
    (
        ","
    |   AssignmentExpression()
        ","
    ) *
    (
        AssignmentExpression()
    ) ?
    "]"
}
JavaCC complains about an ambiguous "," or AssignmentExpression (its contents). Obviously, a LOOKAHEAD specification is required. I have spent a lot of time trying to figure out the LOOKAHEADs and tried different things, such as
    LOOKAHEAD (AssignmentExpression() ",") in (...)*
    LOOKAHEAD (AssignmentExpression() "]") in (...)?
and a few other variations, but I could not get rid of the JavaCC warning.
I fail to understand why this does not work:
void ArrayLiteral() :
{
}
{
    "["
    (
        LOOKAHEAD ("," | AssignmentExpression() ",")
        ","
    |   AssignmentExpression()
        ","
    ) *
    (
        LOOKAHEAD (AssignmentExpression() "]")
        AssignmentExpression()
    ) ?
    "]"
}
OK, AssignmentExpression() per se is ambiguous, but the trailing "," or "]" in the LOOKAHEADs should make it clear which of the choices should be taken. Or am I mistaken here?
What would a correct LOOKAHEAD specification for this production look like?
Update
This did not work, unfortunately:
void ArrayLiteral() :
{
}
{
    "["
    (
        ","
    |
        LOOKAHEAD (AssignmentExpression() ",")
        AssignmentExpression()
        ","
    ) *
    (
        AssignmentExpression()
    ) ?
    "]"
}
Warning:
Warning: Choice conflict in (...)* construct at line 6, column 5.
Expansion nested within construct and expansion following construct
have common prefixes, one of which is: "function"
Consider using a lookahead of 2 or more for nested expansion.
Line 6 is the ( before the first LOOKAHEAD. The common prefix "function" is simply one of the possible starts of AssignmentExpression.
Here is yet another approach. It has the advantage of identifying which commas indicate undefined elements without using any semantic actions.
void ArrayLiteral() : {} { "[" MoreArrayLiteral() }
void MoreArrayLiteral() : {} {
    "]"
|   "," /* undefined item */ MoreArrayLiteral()
|   AssignmentExpression() ( "]" | "," MoreArrayLiteral() )
}
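To illustrate why this left-factored form gets by with a single token of lookahead, here is a minimal hand-written Java sketch of the same two productions. It is not JavaCC output, only an approximation of the decisions such a parser has to make. The class name ArrayLiteralSketch, the single "E" token standing in for a whole AssignmentExpression, and the "hole" marker are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical recognizer mirroring ArrayLiteral / MoreArrayLiteral.
// Assumption: the token "E" stands in for one complete AssignmentExpression.
final class ArrayLiteralSketch {
    private final List<String> tokens;
    private final List<String> elements = new ArrayList<>();
    private int pos = 0;

    private ArrayLiteralSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Returns "E" for each expression and "hole" for each elided element.
    static List<String> parse(String... tokens) {
        ArrayLiteralSketch p = new ArrayLiteralSketch(Arrays.asList(tokens));
        p.arrayLiteral();
        if (p.pos != p.tokens.size()) {
            throw new IllegalStateException("unexpected input after ]");
        }
        return p.elements;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private void expect(String token) {
        if (!peek().equals(token)) {
            throw new IllegalStateException(
                "expected " + token + " but found " + peek() + " at " + pos);
        }
        pos++;
    }

    // ArrayLiteral : "[" MoreArrayLiteral
    private void arrayLiteral() {
        expect("[");
        moreArrayLiteral();
    }

    // Every decision below inspects only the next token.
    private void moreArrayLiteral() {
        switch (peek()) {
            case "]":                     // "]"
                pos++;
                return;
            case ",":                     // "," MoreArrayLiteral: an undefined item
                pos++;
                elements.add("hole");
                moreArrayLiteral();
                return;
            default:                      // AssignmentExpression ( "]" | "," MoreArrayLiteral )
                expect("E");
                elements.add("E");
                if (peek().equals("]")) {
                    pos++;
                    return;
                }
                expect(",");              // this comma is a separator, not a hole
                moreArrayLiteral();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("[", ",", "E", ",", "]")); // prints [hole, E]
    }
}
```

A comma seen where an element could start is counted as a hole, while a comma seen right after an expression is only a separator. That is why the choice between "another element follows" and "the literal ends" is postponed until after AssignmentExpression has been consumed, where "]" and "," tell the two cases apart.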