I'm exceptionally new to jison and have managed to piece together a helpful query parser. I'm now trying to create a parser that can parse a string like "a == 1 and b == 1 and c == 1" into an object like
{and: [
{a: {eq: 1}},
{b: {eq: 1}},
{c: {eq: 2}}
]}
while a string like "a == 1 or b == 1 and c == 1" should parse into an object like
{or: [
{a: {eq: 1}},
{and: [
{b: {eq: 1}},
{c: {eq: 1}}
]}
]}
My grammar looks like this so far:
%lex
%%
\s+ /*skip whitespace*/
\"(\\.|[^"])*\" yytext = yytext.substr(1, yyleng-2); return 'STRING';
"==" return '==';
and[^\w] return 'and';
or[^\w] return 'or';
[0-9]+(?:\.[0-9]+)?\b return 'NUMBER';
[a-zA-Z][\.a-zA-Z0-9_]* return 'SYMBOL';
<<EOF>> return 'EOF';
/lex
%left 'or'
%left 'and'
%left '=='
%start expressions
%%
expressions
: e EOF
{$$ = $1;}
;
e
: property '==' value
{ $$ = {}; $[$1] = {eq: $3}; }
| boolAnd
{ $$ = {and: $1}}
| boolOr
{ $$ = {or: $1}}
;
boolAnd
: boolAnd 'and' e
{$$ = $1; $1.push($3);}
| e 'and' e
{$$ = [$1, $3];}
;
boolOr
: boolOr 'or' e
{$$ = $1; $1.push($3);}
| e 'or' e
{$$ = [$1, $3];}
;
property
: SYMBOL
{$$ = $1;}
;
value
: NUMBER
{$$ = Number(yytext);}
| STRING
{$$ = yytext; }
;
and it gives me the following conflict errors:
Conflict in grammar: multiple actions possible when lookahead token is and in state 4
- reduce by rule: e -> boolAnd
- shift token (then go to state 11)
Conflict in grammar: multiple actions possible when lookahead token is or in state 5
- reduce by rule: e -> boolOr
- shift token (then go to state 12)
States with conflicts:
State 4
e -> boolAnd . #lookaheads= EOF and or
boolAnd -> boolAnd .and e #lookaheads= EOF and or
State 5
e -> boolOr . #lookaheads= EOF and or
boolOr -> boolOr .or e #lookaheads= EOF or and
Is anyone able to offer a suggestion on what I am doing wrong? Many thanks
Since
e : boolAnd
It is impossible to decide between:
boolAnd: e 'and' e
| boolAnd 'and' e
which is what jison is complaining about. (And it's worth noting that the reduction of boolAnd
to e
does not seem to be what you want. It's really a type error, or would be if JS had types.)
Personally, I'd just use binary trees; in my experience, they turn out to be easier to work with. You can do that easily with a single non-terminal and precedence declarations.
%left 'or'
%left 'and'
%%
e
: property '==' value
{ $$ = {eq: [$1, $3]}; /* This seems better to me. YMMV. }
| e 'and' e
{ $$ = {and: [$1, $3]}}
| e 'or' e
{ $$ = {or: [$1, $3]}}
| '(' e ')'
{ $$ = $2 /* You didn't have this one, but it seems useful */ }
;
It is possible to make a grammar which will handle variadic operators (i.e.. reduce x OP y OP z
to {OP: [x, y, z]}
) but it's actually quite a bit of work to get it right, and it does not yield easily to a solution based on precedence declarations. Unless you really want to distinguish between x OP y OP z
and x OP (y OP z)
, which is unnecessary in the case of boolean operators, it is usually easier and more general to collapse multiple similar binary operators in a second pass over the parse tree, or directly when creating the binary node, by examining the operator types of the child expressions.