Search code examples
bisonyacc

"obvious" shift/reduce error in 6 tokens or less


I have a grammar, illustrated below, in which statements have an error clause that can include an arbitrary number of statements. The error clause must be terminated by an END (like case/esac in the shell). In the absence of an error clause, the statement may be terminated, but need not be.

The bison report shows 2 s/r conflicts. In each case, the conflict is about what to do with the terminator, either:

  • shift, and go to the state that only reduces it, or
  • reduce it

That seems like a distinction without a difference to me, but I haven't been able to explain that to bison.

I'm sure I'm missing something basic. I'm hoping someone can explain what it is.

%token ERROR ARG

%right WALK RUN
%left  KLAW NUR 
                        
%%

statements:     statement
        |       statements statement
                ;

statement:      walk
        |       run
                ;

args:           ARG
        |       args ARG
                ;

on_error:       ERROR statements
                ;

walk:           WALK ARG args on_error KLAW
        |       WALK ARG args
        |       WALK ARG args          KLAW 
                ;

run:            RUN ARG args on_error NUR
        |       RUN ARG args          nur
                ;

nur:            %empty
        |       NUR
        ;

Here's part of the report I'm looking at:

State 12

    6 args: args . ARG
    8 walk: WALK ARG args . on_error KLAW
    9     | WALK ARG args .
   10     | WALK ARG args . KLAW

    ERROR  shift, and go to state 14
    ARG    shift, and go to state 15
    KLAW   shift, and go to state 16

    KLAW      [reduce using rule 9 (walk)]
    $default  reduce using rule 9 (walk)

    on_error  go to state 17
...
State 16

   10 walk: WALK ARG args KLAW .

    $default  reduce using rule 10 (walk)

Solution

  • Consider the following short program (six tokens, but I cheated by leaving out the ARGs, because they are just clutter. You can pretend that every WALK is followed by at least two ARGs, but it makes no difference to the problem. I also subscripted the WALKs for clarity)

    WALK-1 ERROR WALK-2 KLAW WALK-3 KLAW
    

    What does it mean? The options are (with KLAWs numbered to match their WALK):

        WALK-1
          ERROR WALK-2
        KLAW-1
        WALK-3 KLAW-3
    

    or

        WALK-1
          ERROR WALK-2 KLAW-2
                WALK-3
        KLAW-1
    

    I guess those have different semantics; in the first one WALK-3 always runs after WALK-1 while in the second one it only runs if there was an error in WALK-1.


    It's possible that you are misreading the state table Bison presents you with. In State 12, the conflicting options are to reduce using production 9 (walk: WALK ARG args) leaving KLAW to match some other WALK, or to shift KLAW and then reduce using production 10 (walk: WALK ARG args KLAW), which means that KLAW matches this WALK.


    It's also possible that you are left wondering why your precedence declarations had no effect.

    Precedence rules are used to resolve conflicts, such as that in State 12, where a lookahead token might either be shifted or retained until after one or more reductions. The resolution is to compare the precedence of the lookahead token (KLAW in this case) with the precedence of the production which could be reduced (walk: WALK ARG args). If the lookahead token has higher precedence, then the lookahead token is shifted; if the production has higher precedence, then the production is reduced. If they have the same precedence, then associativity is applied (reduce for %left; shift for %right).

    That's fine, but how does Bison compute the precedence of walk: WALK ARG args? The simple answer is that Bison uses the precedence of the last terminal in the production, in this case ARG. But ARG doesn't have a declared precedence, so the production's precedence is unspecified, and precedence resolution only applies if both the rule and the token have defined precedence.

    In order for precedence resolution to help, you would have to either:

    • Add ARG to your precedence declarations, or

    • use %prec (q.v.) to associate the production with a different terminal symbol.