I am pretty new to lex and yacc.
I'm designing a compiler that generates three-address code.
How can I find where the syntax error happens in my code?
After running:
flex lexer.l
bison -dy parser.y
gcc lex.yy.c y.tab.c -o program.exe
I tried this input:
{ int abc = 234 ; }
and it gives me a syntax error.
How can I fix it?
This is my lexer
lexer.l:
%{
#include "y.tab.h"
#include <string.h>
int yyerror(char *errormsg);
%}
letter [a-zA-z]
digit [0-9]
id {letter}({letter}|{digit})*
ws [ \t]
%%
{ws} ;
\{ { return 300; }
\} { return 301; }
\; { return SEMICOLON; }
"if" { return IF; }
"int" { return INT; }
"float" { return FLOAT; }
"char" { return CHAR; }
\= { return ASSIGN; }
{id} {strcpy(yylval.str,yytext) ; return ID; }
{digit}+ {yylval.ival=atoi(yytext); return NUMBER; }
. {yyerror("Invalid Command");}
%%
int main(void)
{
yyparse();
printf("DONE");
return 0;
}
int yywrap(void)
{
return 0;
}
int yyerror(char *errormsg)
{
fprintf(stderr, "hey!%s\n", errormsg);
exit(1);
}
This is my parser
parser.y:
%{
#include <stdio.h>
#include <stdlib.h>
#include<string.h>
int yylex(void);
int yyerror(const char *s);
%}
%union{int ival; double dval; char str[120]; }
%token INT ASSIGN NUMBER IF SEMICOLON
%token FLOAT
%token ID CHAR
%%
Program:
Block
;
Block:
'{' Stmts '}'
;
Stmts:
Stmts Stmt
| Stmt
;
Stmt:
Block
|IfStmt
|AssignStmt
|DeclStmt
;
IfStmt:
IF '(' Expr ')' Stmt { printf("if found"); }
;
AssignStmt:
Type ID ASSIGN Expr SEMICOLON { printf("int found!"); }
;
DeclStmt:
Type ID SEMICOLON
;
Type:
INT
|FLOAT
|CHAR
;
Expr:
NUMBER
;
The first thing you want to do when trying to figure out a syntax error with Bison is to add the %define parse.error verbose option to your Bison file. This will change the error message to something more helpful than just "syntax error". Note that this is a Bison-specific feature, so you'll need to remove the -y flag when calling Bison. Doing this, the error message will change to:
syntax error, unexpected $undefined, expecting '{'
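For reference, a sketch of where the directive would sit in parser.y, assuming Bison 3.0 or later (older releases spell the same option %error-verbose). Only the first line is new; the rest is the existing declarations section from the question.

```yacc
%define parse.error verbose   /* Bison-specific: report the unexpected and expected tokens */

%union{int ival; double dval; char str[120]; }
%token INT ASSIGN NUMBER IF SEMICOLON
%token FLOAT
%token ID CHAR
```

Without -y, running bison -d parser.y writes parser.tab.c and parser.tab.h instead of y.tab.c and y.tab.h, so the #include in lexer.l and the gcc command need the new file names.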
So it's telling you that it got an $undefined when it expected a {. So what's an $undefined? It's how Bison displays any token whose name it doesn't know. If the token's value is the code of an ASCII character, it'll be displayed as 'x' (with the given character in place of x). If the token has been defined using %token, it'll be displayed as the name associated with that %token declaration. Only when neither is the case will you get $undefined.
So your lexer returns something that is neither an ASCII character nor a defined token. So let's look at your lexer for anything like that and sure enough:
\{ { return 300; }
\} { return 301; }
When your lexer sees a brace, it will return 300 or 301 respectively. These are neither characters nor tokens defined using %token, so they mean nothing to Bison.
Since your parser expects to see '{' and '}', the above should say return '{'; and return '}'; respectively (or return yytext[0]; in both cases if you prefer). Alternatively you could define %token LBRACE RBRACE in your parser, use those instead of '{' and '}' in the Block rule and return those in your lexer. Either way you definitely shouldn't return arbitrary integers in your lexer.
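A minimal sketch of the two options for the brace rules in lexer.l. Use one or the other, not both. LBRACE and RBRACE are simply the names suggested here; any token names would do, as long as the parser declares them.

```lex
    /* Option 1 (lexer.l only): return the character literals the grammar already uses. */
\{      { return '{'; }
\}      { return '}'; }

    /* Option 2 (lexer.l): return named tokens instead. This assumes parser.y
       declares  %token LBRACE RBRACE  and that the Block rule has been changed to
       Block: LBRACE Stmts RBRACE ;                                              */
\{      { return LBRACE; }
\}      { return RBRACE; }
```

The comments are indented on purpose: in the rules section flex would otherwise try to read an unindented line as a pattern.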
You'll also want to return 1 instead of 0 in yywrap, or get rid of it altogether using the noyywrap option. Returning 0 tells the lexer that more input is available, so it keeps trying to read after reaching the end of file instead of stopping.
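Both variants might look like this in lexer.l; pick one. With the option, the hand-written yywrap function should be deleted as well, since flex then behaves as if yywrap returned 1.

```lex
    /* Variant A: goes in the definitions section, before the first %%.
       Remove the yywrap function from the user code section. */
%option noyywrap

    /* Variant B: keep yywrap in the user code section, but signal end of input. */
int yywrap(void)
{
    return 1;   /* 1 = no further input, so the scanner stops at end of file */
}
```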