I'm building a grammar in bison and I have a r/r conflict (which I know where it is), but I don't know how to fix it. I would appreciate any possible help.
The part of my code that includes the conflict is:
orismos2: %empty
|orismos orismos2
|error {yyerrok;yyclearin;};
orismos: orismosmetablitwn
|orismossunartisis
|prwtotuposunartisis;
orismosmetablitwn: tuposdedomenwn listametablitwn SEMICOLON ;
tuposdedomenwn: INT
|BOOL
|STRING;
listametablitwn: ID nid ;
nid: %empty
|pid nid
|error {yyerrok;yyclearin;};
pid: COMMA ID ;
orismossunartisis: kefalidasunartisis tmimaorismwn tmimaentolwn;
prwtotuposunartisis: kefalidasunartisis SEMICOLON;
kefalidasunartisis: typos_synartisis ID OPENBRACKET c CLOSEBRACKET;
typos_synartisis: INT
|BOOL
|VOID;
I have make an output file, in which I can see all the conflicts.
The part of the file that includes the conflicts is:
State 21 conflicts: 1 reduce/reduce
State 22 conflicts: 1 reduce/reduce
Grammar
10 orismos2: %empty
11 | orismos orismos2
12 | error
13 orismos: orismosmetablitwn
14 | orismossunartisis
15 | prwtotuposunartisis
16 orismosmetablitwn: tuposdedomenwn listametablitwn SEMICOLON
17 tuposdedomenwn: INT
18 | BOOL
19 | STRING
20 listametablitwn: ID nid
21 nid: %empty
22 | pid nid
23 | error
24 pid: COMMA ID
25 orismossunartisis: kefalidasunartisis tmimaorismwn tmimaentolwn
26 prwtotuposunartisis: kefalidasunartisis SEMICOLON
27 kefalidasunartisis: typos_synartisis ID OPENBRACKET c CLOSEBRACKET
28 typos_synartisis: INT
29 | BOOL
30 | VOID
State 21
17 tuposdedomenwn: INT .
28 typos_synartisis: INT .
ID reduce using rule 17 (tuposdedomenwn)
ID [reduce using rule 28 (typos_synartisis)]
$default reduce using rule 17 (tuposdedomenwn)
State 22
18 tuposdedomenwn: BOOL .
29 typos_synartisis: BOOL .
ID reduce using rule 18 (tuposdedomenwn)
ID [reduce using rule 29 (typos_synartisis)]
$default reduce using rule 18 (tuposdedomenwn)
I really have tried everything, but I can't remove the conflicts... Any ideas or suggestions are welcome!
Thanks a lot!!
The key is here: …
orismos → (13) orismosmetablitwn → (16) tuposdedomenwn listametablitwn
orismos → (14) orismossunartisis → (25) kefalidasunartisis … → (27) typos_synartisis …
orismos → (15) prwtotuposunartisis → (26) kefalidasunartisis … → (27) typos_synartisis …
Also:
tuposdedomenwn → NUMBER | BOOL | STRING
listametablitwn → ID …
kefalidasunartisis → typos_synartisis ID …
typos_synartisis → NUMBER | BOOL | VOID
So, here are some derivations from orismos
:
orismos → orismosmetablitwn → tuposdedomenwn listametablitwn → NUMBER ID …
orismos → orismossunartisis → kefalidasunartisis … → typos_synartisis … → NUMBER ID …
orismos → prwtotuposunartisis → kefalidasunartisis … → typos_synartisis … → NUMBER ID …
Say we are in a context where orismis
is possible, and we have just seen a NUMBER
and we are now looking at an ID
. All three of the above derivations are possible. But there is a small problem: in the first one, NUMBER
is derived from tuposdedomenwn
whereas in the second two it is derived from typos_synartisis
. Before we can do anything with the ID
, we need to know which of the two derivations to reduce, but there is no way we can know that until we see the token after the ID
.
That means the grammar needs two tokens of lookahead, so an LALR(1) parser cannot parse it.
The simplest solution would be to combine the two non-terminals representing sets of types, creating a non-terminal which allows all four possibilites:
tuposdedomenwn: NUMBER | BOOL | STRING | VOID
Then you will have to do a semantic check later on to verify that a variable is not declared VOID
nor a function declared STRING
. (Why can't your functions return strings? That seems a bit limiting.) This strategy of accepting a superscript of the language and then doing semantic checks during AST construction or even later is very common; for example, it is the only way to enforce declaration before use of variables.
But if you don't want to do the semantic check, you could simply defer the reduction by creating non-terminals consisting of a datatype and an ID
. That would lead to something like:
tupodedomenon_kai_metavlites
: INT ID
| BOOL ID
| STRING ID
| tupodedomenon_kai_metavlites ',' ID
tupo_synartisis_kai_metavliton
: INT ID
| BOOL ID
| VOID ID
orismos: orismosmetablitwn
| orismossunartisis
| prwtotuposunartisis
orismosmetablitwn
: tupodedomenon_kai_metavlites ';'
orismossunartisis
: kefalidasunartisis tmimaorismwn tmimaentolwn
prwtotuposunartisis
: kefalidasunartisis ';'
kefalidasunartisis
: tupo_synartisis_kai_metavliton '(' c ')'
With that change, there is no need to reduce the NUMBER
(or other type); the ID
can be shifted, and then the reduction to one or the other possibility will be performed based on the following token: as a variable declaration if the following token is ;
or ,
, and as a function declaration if the following token is (
.