I'm writing a simple expression parser in Jison that allows an arbitrary number of newlines to follow a binary operator in an expression. This is my grammar so far:
{
    "operators": [
        ["left", "+", "-"],
        ["left", "*", "/", "%"]
    ],
    "bnf": {
        "program": [
            ["statement EOF", "return $1;"]
        ],
        "statement": [
            ["expression newlines", "$$ = $1 + ';';"]
        ],
        "expression": [
            ["NUMBER", "$$ = yytext;"],
            ["expression + expression", "$$ = $1 + ' + ' + $3;"],
            ["expression - expression", "$$ = $1 + ' - ' + $3;"],
            ["expression * expression", "$$ = $1 + ' * ' + $3;"],
            ["expression / expression", "$$ = $1 + ' / ' + $3;"],
            ["expression % expression", "$$ = $1 + ' % ' + $3;"],
            ["expression + newlines expression", "$$ = $1 + ' + ' + $4;"],
            ["expression - newlines expression", "$$ = $1 + ' - ' + $4;"],
            ["expression * newlines expression", "$$ = $1 + ' * ' + $4;"],
            ["expression / newlines expression", "$$ = $1 + ' / ' + $4;"],
            ["expression % newlines expression", "$$ = $1 + ' % ' + $4;"]
        ],
        "newlines": [
            ["NEWLINE", ""],
            ["newlines NEWLINE", ""]
        ]
    }
}
As you can see, I'm writing two rules for every binary operator, which seems very redundant. I would rather have a production that matches zero or more NEWLINE tokens (Kleene star) instead of one or more (Kleene plus). How would you do this in Jison?
I use Jison and simply ignore whitespace (including newlines) in the lexer.
The first rule in my %lex section is:
\s+ /* ignore */
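Since the question uses Jison's JSON grammar format, here is a rough sketch of how the same lexer rule might look there. Only the \s+ rule comes from my own grammar; the NUMBER pattern and the remaining rules are assumptions made up to match the tokens in the question.

```json
{
    "lex": {
        "rules": [
            ["\\s+", "/* skip whitespace, including newlines */"],
            ["[0-9]+", "return 'NUMBER';"],
            ["\\+", "return '+';"],
            ["-", "return '-';"],
            ["\\*", "return '*';"],
            ["\\/", "return '/';"],
            ["%", "return '%';"],
            ["$", "return 'EOF';"]
        ]
    }
}
```

Keep in mind that with this rule the lexer never emits a NEWLINE token, so a rule like "expression newlines" in the statement production could no longer match and the statement would need some other terminator.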
You don't have to do it that way, though. If you want to keep the NEWLINE tokens, try something along these lines:
"expression": [
    ["NUMBER", "$$ = yytext;"],
    ["expression + expression", "$$ = $1 + ' + ' + $3;"],
    ["expression - expression", "$$ = $1 + ' - ' + $3;"],
    ["expression * expression", "$$ = $1 + ' * ' + $3;"],
    ["expression / expression", "$$ = $1 + ' / ' + $3;"],
    ["expression % expression", "$$ = $1 + ' % ' + $3;"],
    ["expression newlines", "$$ = $1;"],
    ["newlines expression", "$$ = $2;"]
],
That should allow any number of newlines before or after any expression.
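One caveat: combined with the question's "expression newlines" rule for statements, the two extra rules make the grammar ambiguous (a trailing newline can belong either to the expression or to the statement), so Jison will likely report conflicts. A more direct way to get the Kleene star asked about is a nonterminal with an empty alternative. The following is a minimal sketch, not something I have run through Jison: the name optNewlines is made up for illustration, and it assumes an empty string is accepted as an empty production in the JSON format.

```json
{
    "operators": [
        ["left", "+", "-"],
        ["left", "*", "/", "%"]
    ],
    "bnf": {
        "program": [
            ["statement EOF", "return $1;"]
        ],
        "statement": [
            ["expression newlines", "$$ = $1 + ';';"]
        ],
        "expression": [
            ["NUMBER", "$$ = yytext;"],
            ["expression + optNewlines expression", "$$ = $1 + ' + ' + $4;"],
            ["expression - optNewlines expression", "$$ = $1 + ' - ' + $4;"],
            ["expression * optNewlines expression", "$$ = $1 + ' * ' + $4;"],
            ["expression / optNewlines expression", "$$ = $1 + ' / ' + $4;"],
            ["expression % optNewlines expression", "$$ = $1 + ' % ' + $4;"]
        ],
        "optNewlines": [
            ["", ""],
            ["optNewlines NEWLINE", ""]
        ],
        "newlines": [
            ["NEWLINE", ""],
            ["newlines NEWLINE", ""]
        ]
    }
}
```

Each operator now needs only one rule. Because the operator is still the last terminal in each rule, the precedence declared in "operators" should apply as before.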