I'm devising a very simple grammar that uses the unary minus operator. However, I get a shift/reduce conflict. The Bison manual, and everywhere else I look, says that I should define a new token, give it higher precedence than the binary minus operator, and then use "%prec TOKEN" in the rule.
I've done that, but I still get the warning. Why?
I'm using bison (GNU Bison) 2.4.1. The grammar is shown below:
%{
#include <string>
extern "C" int yylex(void);
%}
%union {
    std::string token;
}
%token <token> T_IDENTIFIER T_NUMBER
%right T_EQUAL
%left T_MUL T_DIV
%left UNARY
%start program
%%
program : statements expr
statements : '\n'
           | statements line
line : assignment
     | expr
assignment : T_IDENTIFIER T_EQUAL expr
expr : T_NUMBER
     | expr T_PLUS expr
     | expr T_MINUS expr
     | expr T_MUL expr
     | expr T_DIV expr
     | T_MINUS expr %prec UNARY
%prec doesn't do as much as you might hope here. It tells Bison that when you have - a * b you want to parse it as (- a) * b instead of - (a * b). In other words, it will prefer the UNARY rule over the T_MUL rule. In either case, you can be certain that the UNARY rule will get applied eventually; the only question is the order in which the input gets reduced to the unary argument.
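To make that concrete, here is a minimal sketch of an expression-only grammar. It assumes the token names from the question and adds a %left declaration for T_PLUS and T_MINUS, which the grammar as posted does not show. The comments trace the one decision that %prec influences.

```yacc
/* Sketch only: precedence rises from the first declaration to the last. */
%token T_NUMBER
%left T_PLUS T_MINUS    /* assumed declaration, lowest precedence */
%left T_MUL T_DIV
%left UNARY             /* never returned by the lexer; used only via %prec */
%%
expr : T_NUMBER
     | expr T_PLUS expr
     | expr T_MINUS expr
     | expr T_MUL expr
     | expr T_DIV expr
     | T_MINUS expr %prec UNARY
     ;
/* Input: - a * b
   After reading "- a" the parser holds T_MINUS expr and sees T_MUL.
   It compares the rule's precedence (UNARY) with the lookahead's (T_MUL).
   UNARY is higher, so it reduces first, which yields (- a) * b. */
```

Both candidate actions here belong to rules that carry a precedence, which is why Bison can pick one silently.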
In your grammar, things are very different. Any sequence of line non-terminals makes up a statements non-terminal, and nothing says that a line non-terminal must end at an end-of-line. In fact, any expression can be a line. So there are basically two ways to parse a - b: either as a single line with a binary minus, or as two “lines”, the second starting with a unary minus. The reduction that would end the first line, line : expr, contains no token and so carries no precedence for Bison to compare against T_MINUS. There is nothing to decide which of these parses applies, so the rule-based precedence won't work here yet.
The solution is to fix your line splitting by requiring every line to actually end with, or be followed by, an end-of-line symbol.
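A minimal sketch of that fix might look like the following. It assumes the token names from the question, a %left declaration for T_PLUS and T_MINUS (not shown in the grammar as posted), and a simplified program rule without the trailing expr. Treat it as an illustration, not as a drop-in replacement.

```yacc
/* Sketch: every line is terminated by '\n', so "a - b" on one line can
   no longer be split into the two lines "a" and "- b". */
%token T_IDENTIFIER T_NUMBER
%right T_EQUAL
%left T_PLUS T_MINUS    /* assumed declaration */
%left T_MUL T_DIV
%left UNARY
%start program
%%
program    : statements
           ;
statements : /* empty */
           | statements line
           ;
line       : assignment '\n'
           | expr '\n'
           | '\n'               /* blank line */
           ;
assignment : T_IDENTIFIER T_EQUAL expr
           ;
expr       : T_NUMBER
           | expr T_PLUS expr
           | expr T_MINUS expr
           | expr T_MUL expr
           | expr T_DIV expr
           | T_MINUS expr %prec UNARY
           ;
```

With the terminator in place, a T_MINUS seen after an expr can only continue that expression. The remaining conflicts are all between expr rules that carry a precedence, so the declarations can resolve them.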
If you really want the behaviour your grammar indicates with respect to line endings, you'd need two separate non-terminals for expressions: one for those which can start with a T_MINUS and one for those which cannot. You'd have to propagate this up the tree: the first line may start with a unary minus, but subsequent ones must not. Inside parentheses, starting with a minus would be all right again.
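One way this could be laid out is sketched below. The names nm_expr, nm_term, first_line and next_line are made up for illustration, and T_LPAREN and T_RPAREN are hypothetical tokens, since the grammar as posted shows no parentheses. The sketch also swaps the precedence declarations for layered expr/term/factor rules: a unit rule linking the two expression families would contain no token and so carry no precedence.

```yacc
/* Sketch: lines are not terminated; only the first line may start with a
   unary minus. The nm_ prefix means "cannot start with T_MINUS". */
%token T_IDENTIFIER T_NUMBER T_EQUAL
%token T_PLUS T_MINUS T_MUL T_DIV
%token T_LPAREN T_RPAREN    /* hypothetical tokens */
%start program
%%
program    : /* empty */
           | lines
           ;
lines      : first_line
           | lines next_line
           ;
first_line : assignment
           | expr               /* may start with T_MINUS */
           ;
next_line  : assignment
           | nm_expr            /* must not start with T_MINUS */
           ;
assignment : T_IDENTIFIER T_EQUAL expr
           ;
expr       : expr T_PLUS term
           | expr T_MINUS term
           | term
           ;
term       : term T_MUL factor
           | term T_DIV factor
           | factor
           ;
factor     : T_MINUS factor     /* unary minus binds tightest */
           | primary
           ;
nm_expr    : nm_expr T_PLUS term
           | nm_expr T_MINUS term
           | nm_term
           ;
nm_term    : nm_term T_MUL factor
           | nm_term T_DIV factor
           | primary            /* a primary never starts with T_MINUS */
           ;
primary    : T_NUMBER
           | T_LPAREN expr T_RPAREN   /* a leading minus is fine again inside */
           ;
```

Only the leftmost operand of a line needs the restricted form. Right-hand operands and parenthesised sub-expressions use the ordinary term, factor and expr rules, so a - - b and (- a) remain valid.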