
Why do I get a syntax error for the expression evaluator program?


I tried the following input

a = 10;

print a;

print 1+2+3;

a = 5+10;

I get a syntax error when I run the program with the above input. There is no error during compilation.

Here's the code

Flex

%{
/* header files */
%}

/* regex */

%option yylineno

%%


"println"       {  printf("token is println");  return(TOK_PRINTLN);}
"print" { printf("token is print"); return(TOK_PRINTLN); }

"main()" { return(TOK_MAIN); }
{digit}+    {  /* convert to int and store its val*/
              printf("token is %d", yylval.int_val);
            return INTEGER;
            }
{id} {
      /* convert to char */
      printf("token is %c", yylval.id_val);
      return(TOK_ID);
     } 

";" {   return(TOK_SEMICOLON);  }
"+" {   return(TOK_ADD);    }
"-" {   return(TOK_SUB);    } /* not req  */
"*" {   return(TOK_MUL);    }
"/" {   return(TOK_DIV);    } /* not req  */
"=" { return(TOK_EQ);   }
[ \t\n]     {printf("token is space");}

.   {printf("Invalid character '%c', ignored\n", 
        yytext[0]);
    }

%%

For Bison, we use a symbol table, which is an array. We take the variable (the identifier represented by TOK_ID) and convert it to an index at which we can store the value of the expression.
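
A minimal sketch of such an array-backed symbol table, not the original implementation. It assumes every identifier is a single lowercase letter, so the table needs only 26 slots; ST_SIZE and that letter-to-index mapping are hypothetical choices for illustration, while the function names follow the stubs in the Bison file.

```c
#include <assert.h>

#define ST_SIZE 26   /* assumption: one slot per lowercase letter 'a'..'z' */

/* Static storage, so every slot starts out as 0. */
static int symbol_table[ST_SIZE];

/* Map an identifier character to its slot, or -1 if it is not 'a'..'z'. */
int getSTIndex(char c) {
    return (c >= 'a' && c <= 'z') ? c - 'a' : -1;
}

/* Read the value currently stored for identifier c. */
int getSTVal(char c) {
    int i = getSTIndex(c);
    assert(i >= 0);          /* reject characters outside the assumed range */
    return symbol_table[i];
}

/* Store v as the value of identifier c. */
void setSTVal(char c, int v) {
    int i = getSTIndex(c);
    assert(i >= 0);          /* reject characters outside the assumed range */
    symbol_table[i] = v;
}
```

With this sketch, the action for a = 10; would call setSTVal('a', 10), and a later print a; would read the value back through getSTVal('a').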

Bison File

%{
  /* header files and function declarations */
%}


%union{
    int int_val;
    char id_val;       
      }

/* tokens and types */
%start stmt

%right TOK_EQ
%left TOK_ADD TOK_SUB
%left TOK_MUL TOK_DIV


%%

 /* grammar */

stmt: expr_stmt TOK_SEMICOLON
     {; /* do nothing*/
     }
     | TOK_PRINTLN expr TOK_SEMICOLON 
     {
      printf("%d \n",$2);
     }
     | stmt TOK_PRINTLN expr TOK_SEMICOLON
     {
     printf("%d \n",$3);
     }
     | stmt expr TOK_SEMICOLON
     {
     ;
     }
;

expr_stmt: TOK_ID TOK_EQ expr
       {
        setSTVal($1, $3);
       }
;

expr:
   /*expr stuff */

;

%%

int getSTIndex(char c){
  /* return index*/
}

int getSTVal(char c){
  /* get val */
}

void setSTVal(char c, int v){
   /* set table val*/
 }


 int yyerror(char *s)
{

 printf("\nsyntax error on line no %d\n",yylineno);
return 0;
 }

void initializeSymbolTable(){
    for(int i=0; i<100; i++)symbol_table[i] = 0; /*avoiding garbage val*/
         /* initialization stuff */
 }

 int main()
 {
  initializeSymbolTable();
  yyparse(); /* C routine produced by bison */
  return 0;
  }

When I tried to debug with the inputs a=5; and a = 5; it captured the token a but threw a syntax error right after that. It couldn't capture = or anything after it.

I can't figure out why it captures only the first digit/command/string and then throws a syntax error.


Solution

  • If I simplify your grammar a bit to

    /* ... */
    %start input
    /* ... */
    
    input: /* empty file/no input */
         | input stmt
    ;
    
    /* each statement is an "expr" followed by a semicolon */
    stmt: expr TOK_SEMICOLON
         {
           ;
         }
         /* This is a function and should go into the rule "expr", too, btw. */
         | TOK_PRINTLN expr TOK_SEMICOLON 
         {
          printf("%d \n",$2);
         }
    ;
    
    
    expr:
        expr TOK_ADD expr
        {
        $$ = $1 + $3;
        }
        /* ... */
        | INTEGER
        {
         $$ = $1;
        }
        | TOK_ID
        {
         $$ =  getSTVal($1);
        }
        | TOK_ID TOK_EQ expr
        {
          setSTVal($1, $3);
          $$ = $3; /* an assignment yields the assigned value */
        }
      ;
    

    It works with the input file

    a = 10;
    
    print
       a;
    
    print 1+2+3;
    
    a = 5
         +
          10;
    print a;
    

    as expected. It is not very elegant but should point you in the right direction.

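    Your problem was that TOK_ID appears in two rules. The assignment rule expr_stmt: TOK_ID TOK_EQ expr is only reachable through stmt: expr_stmt TOK_SEMICOLON, that is, as the very first statement. Every later statement has to match stmt expr TOK_SEMICOLON, so when the second TOK_EQ arrived the parser was inside expr, where there is a rule for TOK_ID alone but none for TOK_ID TOK_EQ. (Admittedly, it is a bit more complicated than that.)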

    If you have the Bison documentation at hand, you might look at the mfcalc example.