Search code examples
parsingbisonyacclex

What decides which production the parser tries?


I am trying to build a parser for a desk calculator and am using the following bison code for it.

%union{
    float f;
    char c;
    // int 
}

%token <f> NUM
%token <c> ID
%type <f> S E T F G

%%

C   :   S ';'
    |   C S ';'
    ;

S   :   ID '=' E    {fprintf(debug,"13\n");printf("%c has been assigned the value %f.",$1,$3);symbolTable[$1]=$3;}
    |   E           {fprintf(debug,"12\n");result = $$;}
    ;

E   :   E '+' T     {fprintf(debug,"11\n");$$ = $1+$3;}
    |   E '-' T     {fprintf(debug,"10\n");$$ = $1-$3;}
    |   T           {fprintf(debug,"9\n");$$ = $1;}
    ;

T   :   T '*' F     {fprintf(debug,"7\n");$$ = $1*$3;}
    |   T '/' F     {fprintf(debug,"6\n");$$ = $1/$3;}
    |   F           {fprintf(debug,"5\n");$$ = $1;}
    ; 

F   :   G '@' F     {fprintf(debug,"4\n");$$ = pow($1,$3);}
    |   G           {fprintf(debug,"3\n");$$ = $1;}
    ;

G   :   '(' E ')'   {fprintf(debug,"2\n");$$ = $2;}
    |   NUM         {fprintf(debug,"1\n");$$ = $1;}
    |   ID          {fprintf(debug,"0\n");$$ = symbolTable[$1];}
    ;

%%

My LEX rules are

digit   [0-9]
num     {digit}+
alpha   [A-Za-z]
id      {alpha}({alpha}|{digit})*
white   [\ \t]

%%
let     {printf("let");return LET;}
{num}   {yylval.f = atoi(yytext);return NUM;}
{alpha} {yylval.c = yytext[0];return ID;}
[+\-\*/@\(\)]   {return yytext[0];}
.       {}
%%

The input I gave is a=2+3 When the lexer returns an ID(for 'a'), the parser is going for the production with fprintf(debug,"0\n"). But I want it to go for the production fprintf(debug,"13\n").

So, I am wondering what made my parser go for a reduction on production 0, instead of shifting = to stack, and how do I control it?


Solution

  • Well, it turns out that the problem is I am not identifying = as a token here, in my LEX.

    As silly as it sounds, it points out a very important concept of yacc/Bison. The question of whether to shift or reduce is answered by checking the next symbol, also called the lookahead. In this case, the lookahead was NUM(for 2) and not =, because of my faulty LEX code. Since there is no production involving ID followed by NUM, it is going for a reduction to G.

    And about how I figured it out, it turns out bison has a built-in trace feature. It lays out neatly like a diary entry, whatever it does while parsing. each and every step is written down.

    To enable it,

    Run bison with -Dparse.trace option.

    bison calc.y -d -Dparse.trace

    In the main function of parser grab the extern yydebug and set it to non-zero value.

    int main(){
       extern int yydebug;
       yydebug = 1;
       .
       .
       .
    }