Search code examples
bisonyacc

Resolve shift/reduce conflict in Yacc/Bison grammar


I am writing a parser in GNU-Bison to parse captured bits of data of a proprietary protocol. The parser has following tokens:

  • H ....... Header
  • D ....... Data
  • T ....... Terminator

and five numbers of D i.e. Data tokens make up a bulk B i.e.

B : DDDDD
  ;

Ideally the input should be of the form

H DDDDD DDDDD DDDDD ... DDDDD T

aka

 H B B B ... B T

So I have written the following grammar :

%%
CAPTURE :  H PAYLOAD T     { printf("[OK]");}
        ;

PAYLOAD :  B
        |  PAYLOAD B
        ;

B       :  DDDDDD
%%        

Now, to meet some practical conditions such as a pattern like following:

  1. DD H DDDDD DDDDD ... DDDDD T (extra D's at the prefix)
  2. H DDDDD DDDDD ... DDD T (truncated last bulk having just 3 Data D)

I modified my grammar to

%%
CAPTURE :  H PAYLOAD T     { printf("[OK]");}
        ;

PAYLOAD :  B
        |  PAYLOAD B
        |  D PAYLOAD
        |  PAYLOAD D
        ;

B       :  DDDDDD
%%        

But It is giving shift/reduce conflict.
Need help how to correct the grammar so that it recognises above two cases also and is shift/reduce conflict free.


Solution

  • %%
    CAPTURE : OPTD1 H PAYLOAD OPTD2 T     { printf("[OK]");}
            ;
    
    PAYLOAD :  B
            |  PAYLOAD B
            ;
    
    B       :  D D D D D
            ;
    
    OPTD1   :
            |  OPTD1 D
            ;
    
    OPTD2   :
            |  D D D
            ;
    %%     
    

    I added two new non-terminals OPTD1 and OPTD2 on the right-hand side of the first production and kept your original rule for PAYLOAD as it was. OPTD1 can be rewritten as 0 or more D terminal symbols and OPTD2 can be rewritten as 0 or 3 D terminal symbols.

    If your TOKENS, H, T and B were just the characters 'H', 'T' and 'B' respectively, you could easily recognize valid input with the following regular expression:

    '^D*H(DDDDD)+(DDD)?T$'
    

    In any event, you should be able to recognize valid input with a finite state automaton with the power of a pushdown automaton provided by YACC not being required.