Find Where Syntax Error in Flex Yacc Happened


I am pretty new to lex and yacc.

I'm writing a compiler that generates three-address code.

How can I find where the syntax error happens in my code?

After running:

flex lexer.l
bison -dy parser.y
gcc lex.yy.c y.tab.c -o program.exe

I try this input:

{ int abc = 234 ; }

and it gives me a syntax error!

How can I fix it?

This is my lexer.

lexer.l:

%{

#include "y.tab.h"
#include <string.h>
int yyerror(char *errormsg);

%}

letter  [a-zA-z]
digit   [0-9]
id      {letter}({letter}|{digit})*
ws      [ \t]


%%
{ws}        ;
\{          { return 300; }
\}          { return 301; }
\;          { return SEMICOLON; }
"if"        { return IF; }
"int"       { return INT; }
"float"     { return FLOAT; }
"char"      { return CHAR; }
\=          { return ASSIGN; }      
{id}        {strcpy(yylval.str,yytext) ; return ID; }
{digit}+    {yylval.ival=atoi(yytext); return NUMBER; }
.           {yyerror("Invalid Command");}
%%



int main(void)
{
   yyparse();
   printf("DONE");
   return 0;
}

int yywrap(void)
{
   return 0;
}

int yyerror(char *errormsg)
{
    fprintf(stderr, "hey!%s\n", errormsg);
    exit(1);
}

This is my parser.

parser.y:

%{

#include <stdio.h>
#include <stdlib.h>
#include<string.h>
int yylex(void);
int yyerror(const char *s);

%}


%union{int ival; double dval; char str[120]; }

%token INT ASSIGN NUMBER IF SEMICOLON
%token FLOAT
%token ID CHAR

%%

Program: 
        Block
        ;

Block:
        '{' Stmts '}'
        ;

Stmts:
        Stmts Stmt
        | Stmt
        ;

Stmt:
        Block
        |IfStmt
        |AssignStmt
        |DeclStmt
        ;


IfStmt:
        IF '(' Expr ')' Stmt  { printf("if found"); }
        ;


AssignStmt:     
        Type ID ASSIGN Expr SEMICOLON { printf("int found!"); }
        ;

DeclStmt:
        Type ID SEMICOLON
        ;


Type:
        INT
        |FLOAT
        |CHAR
        ;


Expr:
    NUMBER
    ;

Solution

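  • The first thing to do when tracking down a syntax error with Bison is to add %define parse.error verbose to your Bison file. This replaces the bare "syntax error" message with one that names the unexpected token and the tokens that were expected. Note that this is a Bison-specific feature, so you'll need to remove the -y flag when calling Bison. Without -y, Bison names its output files after the input file (parser.tab.c and parser.tab.h rather than y.tab.c and y.tab.h), so the #include in the lexer and the gcc command have to be updated to match. With that in place, the error message changes to: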

    syntax error, unexpected $undefined, expecting '{'
    

    So it's telling you that it got an $undefined when it expected a {. What's an $undefined? It's how Bison displays any token whose name it doesn't know. If the token's code is that of an ASCII character, it is displayed as 'x' (with the actual character in place of x). If the token has been defined using %token, it is displayed under the name from that %token declaration. Only when neither is the case do you get $undefined.

    That means your lexer returns something that is neither an ASCII character nor a defined token. Looking through your lexer for anything like that, sure enough:

    \{          { return 300; }
    \}          { return 301; }
    

    When your lexer sees a brace, it will return 300 or 301 respectively. These are neither characters nor tokens defined using %token, so they mean nothing to Bison.
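
To make the three cases above concrete, here is a rough C sketch. classify_token is a hypothetical helper, not part of Bison, and it simplifies what the generated parser really does. The range 258 to 265 is an assumption: it is what Bison usually assigns to eight %token names like the ones in this grammar, but the actual values are in the generated header.

```c
/* Hypothetical illustration only: Bison has no function with this name.
   It mirrors the rule of thumb from the answer for how a code returned
   by yylex is interpreted. */
enum token_kind { TOKEN_EOF, TOKEN_CHAR, TOKEN_NAMED, TOKEN_UNDEFINED };

/* first_named and last_named are the lowest and highest codes given to
   %token names. They are assumptions here; check the generated header
   for the real values. */
static enum token_kind classify_token(int code, int first_named, int last_named)
{
    if (code <= 0)
        return TOKEN_EOF;        /* end of input */
    if (code < 256)
        return TOKEN_CHAR;       /* character token such as '{' */
    if (code >= first_named && code <= last_named)
        return TOKEN_NAMED;      /* declared with %token */
    return TOKEN_UNDEFINED;      /* what Bison reports as $undefined */
}
```

Under those assumptions, classify_token(300, 258, 265) gives TOKEN_UNDEFINED, while classify_token('{', 258, 265) gives TOKEN_CHAR, which is the difference between what the lexer currently returns and what the Block rule expects.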

    Since your parser expects to see '{' and '}', the above should say return '{'; and return '}'; respectively (or return yytext[0]; in both cases if you prefer). Alternatively you could define %token LBRACE RBRACE in your parser, use those instead of '{' and '}' in the Block rule and return those in your lexer. Either way you definitely shouldn't return arbitrary integers in your lexer.


    You'll also want to return 1 instead of 0 in yywrap, or get rid of it altogether using %option noyywrap. Returning 0 tells the lexer that more input has been set up, so it keeps trying to read after reaching the end of the file instead of stopping.
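
The effect of the yywrap return value can be sketched with a small stand-in for the scanner's end-of-input handling. This is not flex's actual code: scan_until_done, wrap_done and wrap_more are hypothetical names, and the cap on end-of-file events exists only so the sketch terminates.

```c
/* Hypothetical model of what a flex scanner does at end of input.
   Returns the number of end-of-file events seen before the scanner
   stopped, or -1 if it was still asking for input at the cap. */
static int scan_until_done(int (*wrap)(void), int max_eofs)
{
    int eofs = 0;
    while (eofs < max_eofs) {
        /* ... pretend the current input has been consumed ... */
        eofs++;
        if (wrap() != 0)
            return eofs;   /* non-zero: no more input, scanner stops */
        /* zero: scanner assumes new input is ready and reads again */
    }
    return -1;
}

static int wrap_done(void) { return 1; }  /* the recommended yywrap */
static int wrap_more(void) { return 0; }  /* the yywrap from the question */
```

In this model, scan_until_done(wrap_done, 5) stops at the first end of file and returns 1, while scan_until_done(wrap_more, 5) never stops on its own and returns -1.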