compiler-construction grammar bison reduce-reduce-conflict

Bison reduce/reduce conflict in grammar

I'm building a grammar in bison and I have a r/r conflict (which I know where it is), but I don't know how to fix it. I would appreciate any possible help.

The part of my code that includes the conflict is:

orismos2: %empty
|orismos orismos2
|error {yyerrok;yyclearin;};

orismos: orismosmetablitwn
|orismossunartisis
|prwtotuposunartisis;

orismosmetablitwn: tuposdedomenwn listametablitwn SEMICOLON ;

tuposdedomenwn: INT
|BOOL
|STRING;

listametablitwn: ID nid ;

nid: %empty
|pid nid
|error {yyerrok;yyclearin;};

pid: COMMA ID ;

orismossunartisis: kefalidasunartisis tmimaorismwn tmimaentolwn;

prwtotuposunartisis: kefalidasunartisis SEMICOLON;

kefalidasunartisis: typos_synartisis ID OPENBRACKET c CLOSEBRACKET;

typos_synartisis: INT
|BOOL
|VOID;

I have make an output file, in which I can see all the conflicts.

The part of the file that includes the conflicts is:

State 21 conflicts: 1 reduce/reduce
State 22 conflicts: 1 reduce/reduce


Grammar

   10 orismos2: %empty
   11         | orismos orismos2
   12         | error

   13 orismos: orismosmetablitwn
   14        | orismossunartisis
   15        | prwtotuposunartisis

   16 orismosmetablitwn: tuposdedomenwn listametablitwn SEMICOLON

   17 tuposdedomenwn: INT
   18               | BOOL
   19               | STRING

   20 listametablitwn: ID nid

   21 nid: %empty
   22    | pid nid
   23    | error

   24 pid: COMMA ID

   25 orismossunartisis: kefalidasunartisis tmimaorismwn tmimaentolwn

   26 prwtotuposunartisis: kefalidasunartisis SEMICOLON

   27 kefalidasunartisis: typos_synartisis ID OPENBRACKET c CLOSEBRACKET

   28 typos_synartisis: INT
   29                 | BOOL
   30                 | VOID


State 21

   17 tuposdedomenwn: INT .
   28 typos_synartisis: INT .

    ID        reduce using rule 17 (tuposdedomenwn)
    ID        [reduce using rule 28 (typos_synartisis)]
    $default  reduce using rule 17 (tuposdedomenwn)


State 22

   18 tuposdedomenwn: BOOL .
   29 typos_synartisis: BOOL .

    ID        reduce using rule 18 (tuposdedomenwn)
    ID        [reduce using rule 29 (typos_synartisis)]
    $default  reduce using rule 18 (tuposdedomenwn)

I really have tried everything, but I can't remove the conflicts... Any ideas or suggestions are welcome!

Thanks a lot!!

Solution

The key is here: …

orismos → (13) orismosmetablitwn → (16) tuposdedomenwn listametablitwn
orismos → (14) orismossunartisis → (25) kefalidasunartisis … → (27) typos_synartisis …
orismos → (15) prwtotuposunartisis → (26) kefalidasunartisis … → (27) typos_synartisis …

Also:

tuposdedomenwn → NUMBER | BOOL | STRING
listametablitwn → ID …
kefalidasunartisis → typos_synartisis ID …
typos_synartisis → NUMBER | BOOL | VOID

So, here are some derivations from orismos:

orismos → orismosmetablitwn → tuposdedomenwn listametablitwn → NUMBER ID …
orismos → orismossunartisis → kefalidasunartisis … → typos_synartisis … → NUMBER ID …
orismos → prwtotuposunartisis → kefalidasunartisis … → typos_synartisis … → NUMBER ID …

Say we are in a context where orismis is possible, and we have just seen a NUMBER and we are now looking at an ID. All three of the above derivations are possible. But there is a small problem: in the first one, NUMBER is derived from tuposdedomenwn whereas in the second two it is derived from typos_synartisis. Before we can do anything with the ID, we need to know which of the two derivations to reduce, but there is no way we can know that until we see the token after the ID.

That means the grammar needs two tokens of lookahead, so an LALR(1) parser cannot parse it.

The simplest solution would be to combine the two non-terminals representing sets of types, creating a non-terminal which allows all four possibilites:

tuposdedomenwn: NUMBER | BOOL | STRING | VOID

Then you will have to do a semantic check later on to verify that a variable is not declared VOID nor a function declared STRING. (Why can't your functions return strings? That seems a bit limiting.) This strategy of accepting a superscript of the language and then doing semantic checks during AST construction or even later is very common; for example, it is the only way to enforce declaration before use of variables.

But if you don't want to do the semantic check, you could simply defer the reduction by creating non-terminals consisting of a datatype and an ID. That would lead to something like:

tupodedomenon_kai_metavlites
       : INT ID
       | BOOL ID
       | STRING ID
       | tupodedomenon_kai_metavlites ',' ID

tupo_synartisis_kai_metavliton
       : INT ID
       | BOOL ID
       | VOID ID

orismos: orismosmetablitwn
       | orismossunartisis
       | prwtotuposunartisis

orismosmetablitwn
       : tupodedomenon_kai_metavlites ';'

orismossunartisis
       : kefalidasunartisis tmimaorismwn tmimaentolwn

prwtotuposunartisis
       : kefalidasunartisis ';'

kefalidasunartisis
       : tupo_synartisis_kai_metavliton '(' c ')'

With that change, there is no need to reduce the NUMBER (or other type); the ID can be shifted, and then the reduction to one or the other possibility will be performed based on the following token: as a variable declaration if the following token is ; or ,, and as a function declaration if the following token is (.