I want to create a grammar rule for a simplified version of a block statement in C, which matches a list of statements in braces with optional newlines at the beginning and end. Statements in this language are terminated by a newline character.
The optional newlines are so that block statements can span multiple as well as single lines. i.e, both
{ statement }
and
{
statement
}
should be supported.
Currently my rules are as follows:
BlockStmt:
'{' OptionalNewlines BlockStmtList OptionalNewlines '}';
OptionalNewlines:
OptionalNewlines '\n'
| %empty;
Empty blocks are also supported, which are basically blocks with just newlines in them and no statements. This is possible because BlockStmtList can reduce to %empty.
However for empty blocks with just newlines, this leads to a shift-reduce conflict as the newlines can be matched by both the beginning and the ending OptionalNewlines non-terminal.
How do I tell yacc to prioritise one of the OptionalNewlines in the case of an empty block with just newlines?
Unless you have a problem with blank lines in the middle of a block -- something which many of us like to do -- the simple solution is to just allow empty statements (that is, a statement consisting only of the newline terminator. If you do that, you can stop worrying about optional newlines and just use
BlockStmt: '{' BlockStmtList '}';
That's far and away the easiest. But if it doesn't work for you, read on.
In general, you cannot have a sequence of optional lists where two of the lists have the same elements. That leads to an ambiguity: if your grammar allows a* b* a*
(using Kleene * for simplicity) and the input is a
, there is no way to know whether the empty a*
is before or after the empty b*
. "Optional" elements are problematic in many situations; it's often necessary to expand empty non-terminals into multiple rules using non-optional elements:
BlockStmt: '{' '}'
| '{' NewLineList '}'
| '{' NewLineList StmtList OptionalNewLineList '}'
| '{' StmtList OptionalNewLineList '}'